Abstract

How much can randomness help computation? Motivated by this
general question and by volume computation, one of the few instances
where randomness provably helps, we analyze a notion of dispersion
and connect it to asymptotic convex geometry. We obtain a nearly
quadratic lower bound on the complexity of randomized volume
algorithms for convex bodies in $\mathbb{R}^n$ (the current best algorithm has
complexity roughly $n^4$, conjectured to be $n^3$). Our main tools, dispersion
of random determinants and dispersion of the length of a random point
from a convex body, are of independent interest and applicable more
generally; in particular, the latter is closely related to the variance
hypothesis from convex geometry. This geometric dispersion also leads
to lower bounds for matrix problems and property testing.



1 Introduction 

Among the most intriguing questions raised by complexity theory is the 
following: how much can the use of randomness affect the computational 
complexity of algorithmic problems? At the present time, there are many 
problems for which randomized algorithms are simpler or faster than known 
deterministic algorithms but only a few known instances where randomness 
provably helps. 

One problem for which randomness makes a dramatic difference is
estimating the volume of a convex body in $\mathbb{R}^n$. The convex body can be
accessed as follows: for any point $x \in \mathbb{R}^n$, we can determine whether $x$ is
in the body or not (a membership oracle). The complexity of an algorithm
is measured by the number of such queries. The work of Elekes [12] and
Bárány and Füredi [I] showed that no deterministic polynomial-time
algorithm can estimate the volume to within an exponential (in $n$) factor.
We quote their theorem below.

Theorem 1 ([4]). For every deterministic algorithm that uses at most $n^a$
membership queries and given a convex body $K$ with $B_n \subseteq K \subseteq nB_n$ outputs
two numbers $A, B$ such that $A \le \operatorname{vol}(K) \le B$, there exists a body $K'$ for
which the ratio $B/A$ is at least



where c is an absolute constant. 

In striking contrast, the celebrated paper of Dyer, Frieze and Kannan
[10] gave a polynomial-time randomized algorithm to estimate the volume
to arbitrary accuracy (the dependence on $n$ was about $n^{23}$). This result
has been much improved and generalized in subsequent work ($n^{16}$, [17]; $n^{10}$,
PH [2]; $n^8$, [9]; $n^7$, [IS]; $n^5$, [H]; $n^4$, [19]); the current fastest algorithm
has complexity that grows as roughly $O(n^4/\varepsilon^2)$ to estimate the volume to
within relative error $1 + \varepsilon$ with high probability (for recent surveys, see
[22], [23]). Each improvement in the complexity has come with fundamental
insights and led to new isoperimetric inequalities, techniques for analyzing
convergence of Markov chains, algorithmic tools for rounding and sampling
logconcave functions, etc.

These developments lead to the question: what is the best possible
complexity of any randomized volume algorithm? A lower bound of $\Omega(n)$ is
straightforward. Here we prove a nearly quadratic lower bound: there is a
constant $c > 0$ such that any randomized algorithm that approximates the
volume to within a $(1 + c)$ factor needs $\Omega(n^2/\log n)$ queries. The formal
statement appears in Theorem 2.

For the more restricted class of randomized nonadaptive algorithms (also
called "oblivious"), an exponential lower bound is straightforward (Section
5.1). Thus, the use of full-fledged adaptive randomization is crucial in
efficient volume estimation, but cannot improve the complexity below $n^2/\log n$.

In fact, the quadratic lower bound holds for a restricted class of convex
bodies, namely parallelepipeds. A parallelepiped in $\mathbb{R}^n$ centered at the origin
can be compactly represented using a matrix as $\{x : \|Ax\|_\infty \le 1\}$, where $A$
is an $n \times n$ nonsingular matrix; the volume is simply $2^n |\det(A)|^{-1}$. One way
to interpret the lower bound theorem is that in order to estimate $|\det(A)|$
one needs almost as many bits of information as the number of entries of
the matrix. The main ingredient of the proof is a dispersion lemma which
shows that the determinant of a random matrix remains dispersed even after
conditioning the distribution considerably. We discuss other consequences 
of the lemma in Section El 

Our lower bound is nearly the best possible for this restricted class of
convex bodies. Using $O(n^2 \log n)$ queries, we can find a close approximation
to the entire matrix $A$ and therefore any reasonable function of its
entries. This naturally raises the question of what other parameters require
a quadratic number of queries. We prove that estimating the product of
the lengths of the rows of an unknown matrix $A$ to within a factor of about
$(1 + 1/\log n)$ also requires $\Omega(n^2/\log n)$ queries. The simplest version of this
problem is the following: given a membership oracle for any unknown
halfspace $a \cdot x \le 1$, estimate $\|a\|$, the Euclidean length of the normal vector
$a$ (alternatively, estimate the distance of the hyperplane from the origin).
This problem can be solved deterministically using $O(n \log n)$ oracle queries.
We prove that any randomized algorithm that estimates $\|a\|$ to within an
additive error of about $1/\sqrt{\log n}$ requires $\Omega(n)$ oracle queries.

Related earlier work includes [5], [8], showing lower bounds for linear
decision trees (i.e., every node of the tree tests whether an affine function
of the input is nonnegative). [5] considers the problem of deciding whether,
given $n$ real numbers, some $k$ of them are equal, and proves that it
has complexity $\Theta(n \log(n/k))$. [8] proves that the $n$-dimensional knapsack
problem has complexity at least $n^2/2$.

For these problems (length, product of lengths), the main tool in the 
analysis is a geometric dispersion lemma that is of independent interest 
in asymptotic convex geometry. Before stating the lemma, we give some 
background and motivation. There is an elegant body of work that studies 
the distribution of a random point X from a convex body K j3j EJ [7J 121] . 
A convex body $K$ is said to be in isotropic position if $\operatorname{vol}(K) = 1$ and for a
random point $X$ we have

$$E(X) = 0, \quad \text{and} \quad E(XX^T) = \alpha I \ \text{ for some } \alpha > 0.$$

We note that there is a slightly different definition of isotropy (more
convenient for algorithmic purposes) which does not restrict $\operatorname{vol}(K)$ and replaces
the second condition above by $E(XX^T) = I$. Any convex body can be
put in isotropic position by an affine transformation. A famous conjecture
(isotropic constant) says that $\alpha$ is bounded by a universal constant for every
convex body. It follows that $E(\|X\|^2) = O(n)$. Motivated by the analysis of
random walks, Lovász and Vempala made the following conjecture (under
either definition). If true, then some natural random walks are significantly
faster for isotropic convex bodies.

Conjecture 1. For a random point $X$ from an isotropic convex body,

$$\operatorname{var}(\|X\|^2) = O(n).$$

The upper bound of $O(n)$ is achieved, for example, by the isotropic
cube. The isotropic ball, on the other hand, has the smallest possible value,
$\operatorname{var}(\|X\|^2) = O(1)$. The variance lower bound we prove in this paper
(Theorem 6) directly implies the following: for an isotropic convex polytope $P$
in $\mathbb{R}^n$ with at most $\mathrm{poly}(n)$ facets,

$$\operatorname{var}(\|X\|^2) = \Omega\!\left(\frac{n}{\log n}\right).$$

Thus, the conjecture is nearly tight for not just the cube, but any isotropic
polytope with a small number of facets. Intuitively, our lower bound shows
that the length of a random point from such a polytope is not concentrated
as long as the volume is reasonably large. Roughly speaking, this says that
in order to determine the length, one would have to localize the entire vector
in a small region.

Returning to the analysis of algorithms, one can view the output of 
a randomized algorithm as a distribution. Proving a lower bound on the 
complexity is then equivalent to showing that the output distribution after 
some number of steps is dispersed. To this end, we define a simple parameter 
of a distribution: 

Definition 1. Let $\mu$ be a probability measure on $\mathbb{R}$. For any $0 < p < 1$, the
$p$-dispersion of $\mu$ is

$$\operatorname{disp}_\mu(p) = \inf\{|a - b| : a, b \in \mathbb{R},\ \mu([a, b]) \ge 1 - p\}.$$

Thus, for any possible output $z$, and a random point $X$, with probability
at least $p$, $|X - z| \ge \operatorname{disp}_\mu(p)/2$. We prove some useful properties about
this parameter in Section 3.1.
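
As an illustration of Definition 1, the following minimal sketch computes the $p$-dispersion of the empirical measure of a finite sample, that is, the length of the shortest interval containing at least a $(1-p)$ fraction of the sample points. The function name `empirical_dispersion` is hypothetical and is not part of the paper.

```python
import math


def empirical_dispersion(samples, p):
    """Shortest interval length covering at least a (1 - p) fraction of samples.

    This is disp_mu(p) from Definition 1 when mu is the empirical
    (uniform) measure on the given sample points.
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    xs = sorted(samples)
    n = len(xs)
    # Number of sample points the interval [a, b] has to contain.
    k = max(1, math.ceil((1 - p) * n))
    # An optimal interval can be taken to start and end at sample points,
    # so it suffices to slide a window over k consecutive sorted points.
    return min(xs[i + k - 1] - xs[i] for i in range(n - k + 1))
```

For the sample `[0, 1, 2, 10]`, three of the four points fit in an interval of length 2, so the dispersion at `p = 0.25` is 2, while at `p = 0.5` it drops to 1.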
2 Results

2.1 Complexity lower bounds 

We begin with our lower bound for randomized volume algorithms. Besides
the dimension $n$, the complexity also depends on the "roundness" of the
input body. This is the ratio $R/r$ where $rB_n \subseteq K \subseteq RB_n$. To avoid another
parameter in our results, we ensure that $R/r$ is bounded by a polynomial in
$n$.

Theorem 2 (volume). Let $K$ be a convex body given by a membership oracle
such that $B_n \subseteq K \subseteq O(n^8)B_n$. Then there exists a constant $c > 0$ such that
any randomized algorithm that outputs a number $V$ such that
$(1 - c)\operatorname{vol}(K) \le V \le (1 + c)\operatorname{vol}(K)$ holds with probability at least $1 - 1/n$
has complexity $\Omega(n^2/\log n)$.

We note that the lower bound can be easily extended to any algorithm
with success probability $p > 1/2$ with a small overhead [E]. The theorem
actually holds for parallelepipeds with the same roundness condition, i.e.,
convex bodies specified by an $n \times n$ real matrix $A$ as
$\{x \in \mathbb{R}^n : \forall\, 1 \le i \le n,\ |A_i \cdot x| \le 1\}$, where $A_i$ denotes the $i$'th row of $A$.
In this case, the volume of $K$ is simply $2^n |\det(A)|^{-1}$. We restate the
theorem for this case.
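
The representation above can be sketched in a few lines: the membership test is a check on $\|Ax\|_\infty$, and the volume follows from the determinant. This is only an illustrative sketch; the helper names (`determinant`, `parallelepiped_volume`, `is_member`) are assumptions of this example, and the determinant is computed by plain Gaussian elimination with partial pivoting rather than by any method from the paper.

```python
def determinant(rows):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in rows]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def parallelepiped_volume(A):
    """Volume 2^n / |det A| of the body {x : |A_i . x| <= 1 for all i}."""
    d = abs(determinant(A))
    if d == 0.0:
        raise ValueError("A must be nonsingular")
    return 2.0 ** len(A) / d


def is_member(A, x):
    """Membership oracle: True iff every row satisfies |A_i . x| <= 1."""
    return all(abs(sum(a * v for a, v in zip(row, x))) <= 1.0 for row in A)
```

For the identity matrix in dimension 2 the body is the square $[-1,1]^2$ of volume 4; doubling every row halves each side and gives volume 1.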

Theorem 3 (determinant). Let $A$ be an $n \times n$ matrix with entries in $[-1,1]$ and
smallest singular value at least $2^{-12} n^{-7}$ that can be accessed by the following
oracle: for any $x$, the oracle determines whether $\|Ax\|_\infty \le 1$ is true or false.
Then there exists a constant $c > 0$ such that any randomized algorithm that
outputs a number $V$ such that

$$(1 - c)|\det(A)| \le V \le (1 + c)|\det(A)|$$

holds with probability at least $1 - 1/n$, has complexity $\Omega(n^2/\log n)$.

A slightly weaker lower bound holds for estimating the product of the
lengths of the rows of $A$. The proof is given in a later section.

Theorem 4 (product). Let $A$ be an unknown matrix that can be accessed by
the following oracle: for any $x$, the oracle determines whether $\|Ax\|_\infty \le 1$
is true or false. Then there exists a constant $c > 0$ such that any randomized
algorithm that outputs a number $L$ such that

$$\left(1 - \frac{c}{\log n}\right)\prod_{i=1}^{n}\|A_i\| \le L \le \left(1 + \frac{c}{\log n}\right)\prod_{i=1}^{n}\|A_i\|$$

with probability at least $1 - 1/n$ has complexity $\Omega(n^2/\log n)$.

When A has only a single row, we get a stronger bound. In this case,
the oracle is simply a membership oracle for a halfspace. 



Theorem 5 (length). Let $a$ be a vector in $[-1,1]^n$ with $\|a\| \ge \sqrt{n} - 4\sqrt{\log n}$
and $a \cdot x \le 1$ be the corresponding halfspace in $\mathbb{R}^n$ given by a membership
oracle. Then there exists a constant $c > 0$ such that any randomized algorithm
that outputs a number $\ell$ such that

$$\|a\| - \frac{c}{\sqrt{\log n}} \le \ell \le \|a\| + \frac{c}{\sqrt{\log n}}$$

with probability at least $1 - 1/n$ has complexity at least $n - 1$.

The restrictions on the input in all the above theorems ("roundness")
only make them stronger. For example, the bound on the length of $a$ above
implies that it only varies in an interval of length $4\sqrt{\log n}$. To pin it down
in an interval of length $c/\sqrt{\log n}$ (which is $O(\log\log n)$ bits of information)
takes $\Omega(n)$ queries. This result is in the spirit of hardcore predicates [13].

It is worth noting that a very simple algorithm can approximate the
length as in the theorem with probability at least 3/4 and $O(n \log^2 n)$
queries: the projection of $a$ onto a given vector $b$ can be computed up
to an additive error of $1/\mathrm{poly}(n)$ in $O(\log n)$ queries (binary search along
the line spanned by $b$). If $b$ is random in $S^{n-1}$, then $E((a \cdot b)^2) = \|a\|^2/n$. A
Chernoff-type bound gives that the average of $O(n \log n)$ random projections
allows the algorithm to localize $\|a\|$ in an interval of length $O(1/\sqrt{\log n})$ with
probability at least 3/4.
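
The simple algorithm just described can be sketched as follows. This is a hedged illustration rather than the paper's procedure verbatim: the function names, the search cap `t_max`, and the fixed number of bisection steps are assumptions of this sketch. It uses the fact that along a unit direction $b$ with $a \cdot b > 0$, the boundary of the halfspace is hit at $t^* = 1/(a \cdot b)$, so a binary search for $t^*$ recovers the projection.

```python
import math
import random


def make_halfspace_oracle(a):
    """Membership oracle for the halfspace a . x <= 1 (a is hidden from the caller)."""
    def oracle(x):
        return sum(ai * xi for ai, xi in zip(a, x)) <= 1.0
    return oracle


def random_unit_vector(n, rng):
    """Uniform random point on the unit sphere via normalized Gaussians."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]


def projection(oracle, b, t_max=1e6, iters=60):
    """Estimate a . b by binary search along the line spanned by b.

    If neither t_max * b nor -t_max * b leaves the halfspace, then
    |a . b| <= 1 / t_max and the projection is reported as 0.
    """
    for sign in (1.0, -1.0):
        if not oracle([sign * t_max * v for v in b]):
            lo, hi = 0.0, t_max  # the origin is inside, the far point is outside
            for _ in range(iters):
                mid = (lo + hi) / 2.0
                if oracle([sign * mid * v for v in b]):
                    lo = mid
                else:
                    hi = mid
            return sign / hi  # the boundary sits at t* = 1 / |a . b|
    return 0.0


def estimate_length(oracle, n, num_directions, rng):
    """Estimate ||a|| from E((a . b)^2) = ||a||^2 / n over random unit b."""
    total = 0.0
    for _ in range(num_directions):
        b = random_unit_vector(n, rng)
        total += projection(oracle, b) ** 2
    return math.sqrt(n * total / num_directions)
```

Each call to `projection` spends at most `iters + 2` oracle queries, so the total query count is the number of directions times a logarithmic factor, in line with the $O(n \log^2 n)$ count quoted above.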



2.2 Variance of polytopes 

The next theorem states that the length of a random point from a polytope 
with few facets has large variance. This is a key tool in our lower bounds. 
It also has a close connection to the variance hypothesis (which conjectures 
an upper bound for all isotropic convex bodies), suggesting that polytopes 
might be the limiting case of that conjecture. 

Theorem 6. Let $P \subseteq \mathbb{R}^n$ be a polytope with at most $n^k$ facets and contained
in the ball of radius $n^q$. For a random point $X$ in $P$,

$$\operatorname{var}\|X\|^2 \ge \operatorname{vol}(P)^{\frac{4}{n} + \frac{3c}{n\log n}}\, e^{-c(k+3q)}\, \frac{n}{\log n}$$

where c is a universal constant. 

Thus, for a polytope of volume at least 1 contained in a ball of radius at
most $\mathrm{poly}(n)$, with at most $\mathrm{poly}(n)$ facets, we have $\operatorname{var}\|X\|^2 = \Omega(n/\log n)$.
In particular this holds for any isotropic polytope with at most $\mathrm{poly}(n)$
facets. The proof of Theorem 6 is given in Section 7.

2.3 Dispersion of the determinant 

In our proof of the volume lower bound, we begin with a distribution on 
matrices for which the determinant is dispersed. The main goal of the proof 
is to show that even after considerable conditioning, the determinant is still 
dispersed. The next definition will be useful in describing the structure of 
the distribution and how it changes with conditioning. 

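Definition 2. Let $\mathcal{M}$ be a set of $n \times n$ matrices. We say that $\mathcal{M}$ is a
product set along rows if there exist sets $\mathcal{M}_i \subseteq \mathbb{R}^n$, $1 \le i \le n$, such that

$$\mathcal{M} = \{M : \forall\, 1 \le i \le n,\ M_i \in \mathcal{M}_i\}.$$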

Let B n denote the n-dimensional Euclidean unit ball centered at the 
origin. 

Lemma 7. There exists a constant $c > 0$ such that for any partition
$\{A^j\}_{j \in N}$ of $(\sqrt{n}B_n)^n$ into $|N| \le 2^{n^2 - 2}$ parts where each part is a product
set along rows, there exists a subset $N' \subseteq N$ such that

a. $\operatorname{vol}\big(\bigcup_{j \in N'} A^j\big) \ge \frac{1}{2}\operatorname{vol}\big((\sqrt{n}B_n)^n\big)$ and

b. for any $u > 0$ and a random point $X$ from $A^j$ for any $j \in N'$, we have

Pr(|detX| i [n,n(l + C )]) > -L. 

3 Preliminaries 

Throughout the paper, we assume that $n > 12$ to avoid trivial complications.

We define $\pi_V(u)$ to be the projection of a vector $u$ to a subspace $V$.
Given a matrix $R$, let $R_i$ denote the $i$'th row of $R$, and let $\hat{R}$ be the matrix
having the rows of $R$ normalized to be unit vectors. Let $\tilde{R}_i$ be the projection
of $R_i$ to the subspace orthogonal to $R_1, \dots, R_{i-1}$. For any row $R_i$ of matrix
$R$, let $R_{-i}$ denote (the span of) all rows except $R_i$. So $\pi_{R_{-i}^\perp}(R_i)$ is the
projection of $R_i$ orthogonal to the subspace spanned by all the other rows
of $R$.

The volume of the Euclidean unit ball is given by $\pi^{n/2}/\Gamma(n/2 + 1)$, and
its surface area is $2\pi^{n/2}/\Gamma(n/2)$.
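
The two closed forms above are easy to evaluate numerically. The sketch below uses the standard-library gamma function; the function names are assumptions of this example.

```python
import math


def unit_ball_volume(n):
    """Volume of the n-dimensional Euclidean unit ball: pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def unit_sphere_area(n):
    """Surface area of the unit ball in R^n: 2 * pi^(n/2) / Gamma(n/2)."""
    return 2 * math.pi ** (n / 2) / math.gamma(n / 2)
```

In dimensions 2 and 3 these reduce to the familiar values $\pi$ and $4\pi/3$ for the volume and $2\pi$ and $4\pi$ for the surface area, and in every dimension the surface area equals $n$ times the volume.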

3.1 Dispersion

We begin with two simple cases in which large variance implies large disper- 
sion. 

Lemma 8. Let $X$ be a real random variable with finite variance $\sigma^2$.

a. If the support of $X$ is contained in an interval of length $M$ then

$$\operatorname{disp}_X\!\left(\frac{3\sigma^2}{4M^2}\right) \ge \sigma.$$

b. If $X$ has a logconcave density then $\operatorname{disp}_X(p) \ge (1 - p)\sigma$.

Proof. Let $a, b \in \mathbb{R}$ be such that $b - a < \sigma$. Let $\alpha = \Pr(X \notin [a, b])$. Then

$$\operatorname{var} X \le (1 - \alpha)\left(\frac{b - a}{2}\right)^2 + \alpha M^2.$$

This implies

$$\alpha > \frac{3\sigma^2}{4M^2}.$$

For the second part, Lemma 5.5(a) from [20] implies that a logconcave
density with variance $\sigma^2$ is never greater than $1/\sigma$. This implies that if
$a, b \in \mathbb{R}$ are such that $\Pr(X \in [a, b]) \ge p$ then we must have $b - a \ge p\sigma$. □

Lemma 9. Let $X, Y$ be real-valued random variables and $Z$ be a random
variable that is generated by setting it equal to $X$ with probability $\alpha$ and equal
to $Y$ with probability $1 - \alpha$. Then,

$$\operatorname{disp}_Z(\alpha p) \ge \operatorname{disp}_X(p).$$

Lemma 10. Let $f : [0, M] \to \mathbb{R}_+$ be a density function with mean $\mu$ and
variance $\sigma^2$. Suppose the distribution function of $f$ is logconcave. Then
$f$ can be decomposed into a convex combination of densities $g$ and $h$, i.e.,
$f(x) = \alpha g(x) + (1 - \alpha)h(x)$, where $g$ is uniform over an interval $[a, b]$, with
$a \ge \mu$ and $\alpha(a - b)^2 = \Omega(\sigma^2/\log(M/\sigma))$.

This lemma is proved in Section 6.

3.2 Yao's lemma



We will need the following version of Yao's lemma. Informally, the
probability of failure of a randomized algorithm $\nu$ on the worst input is at least
the probability of failure of the best deterministic algorithm against some
distribution $\mu$.

Lemma 11. Let $\mu$ be a probability measure on inputs $I$ (a "distribution on
inputs") and let $\nu$ be a probability measure on deterministic algorithms $A$ (a
"randomized algorithm"). Then

$$\inf_{a \in A} \Pr(\text{algorithm } a \text{ fails on measure } \mu) \le \sup_{i \in I} \Pr(\text{randomized algorithm } \nu \text{ fails on input } i).$$

Let $I$ be a set (a subset of the inputs of a computational problem, for
example the set of all well-rounded convex bodies in $\mathbb{R}^n$ for some $n$). Let $O$
be another set (the set of possible outputs of a computational problem, for
example, real numbers that are an approximation to the volume of a convex
body). Let $A$ be a set of functions from $I$ to $O$ (these functions represent
deterministic algorithms that take elements in $I$ as inputs and have outputs
in $O$). Let $C : I \times A \to \mathbb{R}$ (for $a \in A$ and $i \in I$, $C(i, a)$ is a measure of the
badness of the algorithm $a$ on input $i$, such as the indicator of $a$ giving a
wrong answer on $i$).

Lemma 12. Let $\mu$ and $\nu$ be probability measures over $I$ and $A$, respectively.
Let $C : I \times A \to \mathbb{R}$ be integrable with respect to $\mu \times \nu$. Then

$$\inf_{a \in A} E_{\mu(i)}\, C(i, a) \le \sup_{i \in I} E_{\nu(a)}\, C(i, a).$$

Proof. By means of Fubini's theorem and the integrability assumption we
have

$$E_{\nu(a)} E_{\mu(i)}\, C(i, a) = E_{\mu(i)} E_{\nu(a)}\, C(i, a).$$

Also

$$E_{\nu(a)} E_{\mu(i)}\, C(i, a) \ge \inf_{a \in A} E_{\mu(i)}\, C(i, a)$$

and

$$E_{\mu(i)} E_{\nu(a)}\, C(i, a) \le \sup_{i \in I} E_{\nu(a)}\, C(i, a).$$

□

Proof (of Lemma 11). Let $C : I \times A \to \mathbb{R}$, where for $i \in I$, $a \in A$ we have

$$C(i, a) = \begin{cases} 1 & \text{if } a \text{ fails on } i, \\ 0 & \text{otherwise.} \end{cases}$$

Then the consequence of Lemma 12 for this $C$ is precisely what we want to
prove. □

3.3 The query model and decision trees

We have already discussed the standard query model (let us call it $Q$): A
membership oracle for a convex body $K$ takes any $q \in \mathbb{R}^n$ and outputs YES
if $q \in K$ and NO otherwise. When $K$ is a parallelepiped specified by a
matrix $A$, the oracle outputs YES if $\|Aq\|_\infty \le 1$ and NO otherwise.

It is useful to view the computation of a deterministic algorithm as a
decision tree representing the sequence of queries: the nodes (except the
leaves) represent queries, the root is the first query made by the algorithm
and there is one query subtree per answer. The leaves do not represent
queries but instead the answers to the last query along every path. Any leaf
$l$ has a set $P_l$ of inputs that are consistent with the corresponding path of
queries and answers on the tree. Thus the set of inputs is partitioned by the
leaves.

To prove our main lower bound results for parallelepipeds, it will be
convenient to consider a modified query model $Q'$ that can output more
information: Given $q \in \mathbb{R}^n$, the modified oracle outputs YES as before if
$\|Aq\|_\infty \le 1$; otherwise it outputs a pair $(i, s)$ where $i$ is the "least index
among violated constraints", $i = \min\{j : |A_j q| > 1\}$, and $s \in \{-1, 1\}$
is the "side", $s = \operatorname{sign}(A_i q)$. An answer from $Q'$ gives at least as much
information as the respective answer from $Q$, and this implies that a lower
bound for algorithms with access to $Q'$ is also a lower bound for algorithms
with access to $Q$. The modified oracle $Q'$ has the following useful property
(see Definition 2):

Lemma 13. If the set of inputs is a product set along rows, then the leaves
of a decision tree in the modified query model $Q'$ induce a partition of the
input set where each part is itself a product set along rows.

Proof. We start with $\mathcal{M}$, a product set along rows with components $\mathcal{M}_i$.
Let us observe how this set is partitioned as we go down a decision tree. A
YES answer imposes two additional constraints of the form $-1 \le q \cdot x \le 1$ on
every set $\mathcal{M}_i$. For a NO answer with response $(i, s)$, we get two constraints
for all $\mathcal{M}_j$, $1 \le j < i$, one constraint for the $i$'th set and no new constraints
for the remaining sets. Given this information, a particular setting of any
row (or subset of rows) gives no additional information about the other
rows. Thus, the set of possible matrices at each child of the current query
is a product set along rows. The lemma follows by applying this argument
recursively. □

Apart from the product property given by the previous lemma, if one
assumes additionally that the set of inputs is convex, then in the query
model $Q'$ each part of the partition is a convex set. This property is used in
the proof of the product lower bound (Theorem 4), but is not used in the
volume lower bound (Theorem 2). Thus, for the volume lower bound one
could use an oracle like $Q'$ that outputs the index $i$ but not the sign $s$, and
the product property would be preserved (Lemma 13) but not the convexity.
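
The two query models of this subsection can be sketched directly from their definitions. The function names `oracle_q` and `oracle_q_prime` are hypothetical; rows are indexed from 1 to match the text, and the matrix is represented as a list of rows.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def oracle_q(A, q):
    """Standard model Q: True (YES) iff ||A q||_inf <= 1."""
    return all(abs(dot(row, q)) <= 1.0 for row in A)


def oracle_q_prime(A, q):
    """Modified model Q': "YES", or (i, s) for the least violated row.

    i is the least (1-based) index with |A_i . q| > 1 and s is the sign
    of A_i . q, i.e. the side on which the constraint is violated.
    """
    for i, row in enumerate(A, start=1):
        value = dot(row, q)
        if abs(value) > 1.0:
            return (i, 1 if value > 0 else -1)
    return "YES"
```

An answer of `oracle_q_prime` determines the answer of `oracle_q` (it is YES exactly when the standard oracle says YES), which is the sense in which $Q'$ gives at least as much information as $Q$.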

3.4 Distributions and concentration properties 

We use two distributions on $n \times n$ matrices called $D$ and $D'$ for the lower
bounds in this paper. A random matrix from $D$ is obtained by selecting each
row independently and uniformly from the ball of radius $\sqrt{n}$. A random
matrix from $D'$ is obtained by selecting each entry of the matrix independently
and uniformly from the interval $[-1, 1]$. In the analysis, we will also
encounter random matrices where each entry is selected independently from
$N(0, 1)$. We use the following property.
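
A minimal sketch of how one might sample from the two distributions $D$ and $D'$ is given below. The sampler for the ball uses a standard construction (a normalized Gaussian direction scaled by $r\,U^{1/n}$ with $U$ uniform on $[0,1)$); this construction and all function names are assumptions of the sketch, not something specified in the text.

```python
import math
import random


def uniform_in_ball(n, radius, rng):
    """Uniform random point in the n-dimensional ball of the given radius."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    # The radial part of a uniform point in the ball is radius * U^(1/n).
    scale = radius * rng.random() ** (1.0 / n) / norm
    return [scale * v for v in g]


def sample_D(n, rng):
    """Matrix from D: rows independent and uniform in the ball of radius sqrt(n)."""
    return [uniform_in_ball(n, math.sqrt(n), rng) for _ in range(n)]


def sample_D_prime(n, rng):
    """Matrix from D': entries independent and uniform in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
```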

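Lemma 14. Let $\sigma$ be the minimum singular value of an $n \times n$ matrix $G$
with independent entries from $N(0, 1)$. For any $t > 0$,

$$\Pr(\sigma\sqrt{n} \le t) \le t.$$

Proof. To bound $\sigma$, we will consider the formula for the density of $\lambda = \sigma^2$
given in Theorem 3.1]:

$$f(\lambda) = \frac{n}{2^{n-1/2}}\, \frac{\Gamma(n)}{\Gamma(n/2)}\, \lambda^{-1/2} e^{-\lambda n/2}\, U\!\left(\frac{n-1}{2}, -\frac{1}{2}, \frac{\lambda}{2}\right)$$

where $U$ is the Tricomi function, which satisfies for all $\lambda \ge 0$:

• $U\!\left(\frac{n-1}{2}, -\frac{1}{2}, 0\right) = \Gamma(3/2)/\Gamma((n+2)/2)$
• $U\!\left(\frac{n-1}{2}, -\frac{1}{2}, \lambda\right) \ge 0$
• $\frac{d}{d\lambda} U\!\left(\frac{n-1}{2}, -\frac{1}{2}, \lambda\right) \le 0$

(The first two properties are from Theorem 3.1], the third from [1, 13.1.3
and 13.4.21].)

We will now prove that for any $n$ the density function of $t = \sqrt{n\lambda}$ is at
most 1. To see this, the density of $t$ is given by

$$g(t) = f\!\left(\frac{t^2}{n}\right)\frac{2t}{n} = 2 f(\lambda)\sqrt{\frac{\lambda}{n}} = \frac{\sqrt{n}\,\Gamma(n)}{2^{n-3/2}\,\Gamma(n/2)}\, e^{-\lambda n/2}\, U\!\left(\frac{n-1}{2}, -\frac{1}{2}, \frac{\lambda}{2}\right).$$

Now,

$$\frac{d}{dt} g(t) = \frac{\sqrt{n}\,\Gamma(n)}{2^{n-3/2}\,\Gamma(n/2)} \left[ -\frac{n}{2} e^{-\lambda n/2}\, U\!\left(\frac{n-1}{2}, -\frac{1}{2}, \frac{\lambda}{2}\right) + e^{-\lambda n/2}\, \frac{d}{d\lambda} U\!\left(\frac{n-1}{2}, -\frac{1}{2}, \frac{\lambda}{2}\right) \right] \frac{2t}{n} \le 0.$$

Thus, the maximum of $g$ is at $t = 0$, and

$$g(0) = \frac{\sqrt{n}\,\Gamma(n)\,\Gamma(3/2)}{2^{n-3/2}\,\Gamma(n/2)\,\Gamma\!\left(\frac{n+2}{2}\right)} \le 1.$$

It follows that $\Pr(\sigma\sqrt{n} \le t) \le t$. □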
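
The final inequality $g(0) \le 1$ in the proof above can be checked numerically for a range of dimensions. The sketch below evaluates $g(0)$ through log-gamma functions to avoid overflow; it is a sanity check under the formula as stated in the proof, not a proof, and the function name `g0` is an assumption of this example.

```python
import math


def g0(n):
    """Value sqrt(n) Gamma(n) Gamma(3/2) / (2^(n - 3/2) Gamma(n/2) Gamma((n+2)/2))."""
    log_value = (
        0.5 * math.log(n)
        + math.lgamma(n)
        + math.lgamma(1.5)
        - (n - 1.5) * math.log(2.0)
        - math.lgamma(n / 2)
        - math.lgamma((n + 2) / 2)
    )
    return math.exp(log_value)
```

By the duplication formula for the gamma function, the expression simplifies to $\sqrt{n/2}\,\Gamma((n+1)/2)/\Gamma(n/2+1)$, which stays below 1 and approaches 1 as $n$ grows.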



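Lemma 15. Let $X$ be a random $n$-dimensional vector with independent
entries from $N(0, 1)$. Then for $\varepsilon > 0$

$$\Pr(\|X\|^2 \ge (1 + \varepsilon)n) \le \left((1 + \varepsilon)e^{-\varepsilon}\right)^{n/2}$$

and for $\varepsilon \in (0, 1)$

$$\Pr(\|X\|^2 \le (1 - \varepsilon)n) \le \left((1 - \varepsilon)e^{\varepsilon}\right)^{n/2}.$$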
For a proof, see \2i\ Lemma 1.3]. 

Lemma 16. Let $X$ be a uniform random vector in the $n$-dimensional ball
of radius $r$. Let $Y$ be an independent random $n$-dimensional unit vector.
Then,

$$E(\|X\|^2) = \frac{nr^2}{n+2} \quad \text{and} \quad E\big((X \cdot Y)^2\big) = \frac{r^2}{n+2}.$$

Proof. For the first part, we have

$$E(\|X\|^2) = \frac{\int_0^r t^{n+1}\,dt}{\int_0^r t^{n-1}\,dt} = \frac{nr^2}{n+2}.$$

For the second part, because of the independence and the symmetry we can
assume that $Y$ is any fixed vector, say $(1, 0, \dots, 0)$. Then $E\big((X \cdot Y)^2\big) = E(X_1^2)$. But

$$E(X_1^2) = E(X_2^2) = \cdots = \frac{1}{n}\sum_{i=1}^{n} E(X_i^2) = \frac{E(\|X\|^2)}{n} = \frac{r^2}{n+2}.$$

□
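
Both identities of Lemma 16 can be checked by simulation. The following Monte Carlo sketch is illustrative only: the sampler construction, the function names, and the trial count are assumptions of this example.

```python
import math
import random


def random_unit_vector(n, rng):
    """Uniform random unit vector via normalized Gaussians."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]


def uniform_in_ball(n, r, rng):
    """Uniform random point in the ball of radius r (direction times r * U^(1/n))."""
    direction = random_unit_vector(n, rng)
    radius = r * rng.random() ** (1.0 / n)
    return [radius * v for v in direction]


def lemma_16_estimates(n, r, trials, rng):
    """Monte Carlo estimates of E(||X||^2) and E((X . Y)^2)."""
    sum_norm_sq = 0.0
    sum_proj_sq = 0.0
    for _ in range(trials):
        x = uniform_in_ball(n, r, rng)
        y = random_unit_vector(n, rng)
        sum_norm_sq += sum(v * v for v in x)
        sum_proj_sq += sum(a * b for a, b in zip(x, y)) ** 2
    return sum_norm_sq / trials, sum_proj_sq / trials
```

For $n = 5$ and $r = 2$ the lemma predicts $E(\|X\|^2) = 20/7$ and $E((X \cdot Y)^2) = 4/7$, and the estimates should approach these values as the number of trials grows.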



Lemma 17. There exists a constant $c > 0$ such that if $P \subseteq \mathbb{R}^n$ is compact
and $X$ is a random point in $P$ then

$$E\|X\|^2 \ge c\,(\operatorname{vol} P)^{2/n}\, n.$$

Proof. For a given value of $\operatorname{vol} P$, the value $E\|X\|^2$ is minimized when $P$ is
a ball centered at the origin. For some $c > 0$ we have that the volume of
the ball of radius $r$ is

7J -n/2 r n 27j- n / 2 r n 27T n / 2 r n c n/2 r n 
> r- > 



T(n/2 + l) nY{n/2) ~ n(n/2) n / 2 ~ n n / 2 

This implies that, for a given value of volP, the radius r of the ball of that 
volume satisfies 

„n/2 r n 

On the other hand, Lemma 16 claims that for $Y$ a random point in the ball
of radius $r$, we have

$$E\|Y\|^2 = \frac{nr^2}{n+2}. \qquad (2)$$

Combining (1), (2) and the minimality of the ball, we get

$$\left(\frac{c\,E\|X\|^2\,(n+2)}{n^2}\right)^{n/2} \ge \operatorname{vol} P$$

and this implies the desired inequality. □

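We conclude this section with two elementary properties of variance.

Lemma 18. Let $X, Y$ be independent real-valued random variables. Then

$$\frac{\operatorname{var}(XY)}{(E(XY))^2} = \left(1 + \frac{\operatorname{var} X}{(EX)^2}\right)\left(1 + \frac{\operatorname{var} Y}{(EY)^2}\right) - 1 \ge \frac{\operatorname{var} X}{(EX)^2} + \frac{\operatorname{var} Y}{(EY)^2}.$$

Lemma 19. For real-valued random variables $X, Y$,
$\operatorname{var} X = E_Y \operatorname{var}(X \mid Y) + \operatorname{var}_Y E(X \mid Y)$.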

4 Lower bound for length estimation



In this section, we prove Theorem 5. Let $a$ be a uniform random vector from
$[-1, 1]^n$. By Lemma 15, $\|a\| \ge \sqrt{n} - 4\sqrt{\log n}$ as required by the theorem
with probability at least $1 - 1/n^2$. We will prove that there exists a constant
$c > 0$ such that any deterministic algorithm that outputs a number $\ell$ such
that

$$\|a\| - \frac{c}{\sqrt{\log n}} \le \ell \le \|a\| + \frac{c}{\sqrt{\log n}}$$

with probability at least $1 - O(1/(n \log n))$ makes at least $n - 1$ halfspace
queries. Along with Yao's lemma this proves the theorem.

Our access to $a$ is via a membership oracle for the halfspace $a \cdot x \le 1$.
Consider the decision tree of height $h$ for some deterministic algorithm.
This will be a binary tree. The distribution at a leaf $l$ is uniform over
the intersection of $[-1, 1]^n$ with the halfspaces given by the path (queries,
responses) to the leaf $l$ from the root $r$, i.e., uniform over a polytope $P_l$ with
at most $2n + h$ facets.

The volume of the initial set is $2^n$. The total volume of leaves with $\operatorname{vol}(P_l) < 1$
is less than $|L| = 2^h$ and so the total volume of leaves with $\operatorname{vol}(P_l) \ge 1$ is
at least $2^n - 2^h$. Setting $h = n - 1$, this is $2^{n-1}$ and so with probability at
least 1/2, $\operatorname{vol}(P_l) \ge 1$. For a random point $X$ from any such $P_l$, Theorem 6
implies that $\operatorname{var}\|X\|^2 \ge cn/\log n$ for some absolute constant $c > 0$. Now by
Lemma 8(a), and the fact that the support of $\|X\|^2$ is an interval of length
$n$, we get that for any $b$,

$$\Pr\left[\,\big|\|X\|^2 - b\big| \ge \frac{1}{2}\sqrt{\frac{cn}{\log n}}\,\right] \ge \frac{3c}{4n\log n}.$$

It follows that $\|X\|$ is dispersed after $n - 1$ queries. We note that the lower
bound can be extended to any algorithm that succeeds with probability
$1 - 1/n^{\varepsilon}$ by a standard trick to boost the success probability: we repeat the
algorithm $O(1/\varepsilon)$ times and use the median of the results.
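
The boosting trick mentioned at the end of the previous paragraph can be sketched generically: run the estimator several times and report the median. The function name `median_boost` and the callable interface are assumptions of this sketch.

```python
import statistics


def median_boost(run_once, repetitions):
    """Run a randomized estimator several times and return the median output.

    run_once is a zero-argument callable performing one independent run.
    If each run is accurate with probability bounded above 1/2, the median
    is accurate unless at least half of the runs fail.
    """
    if repetitions < 1:
        raise ValueError("need at least one repetition")
    return statistics.median(run_once() for _ in range(repetitions))
```

As a toy usage example, if five runs return 10.0, 1000.0, 10.1, 9.9 and -500.0, the two outliers are discarded and the median is 10.0.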



5 Complexity of randomized volume algorithms 

We will use the distribution $D$ on parallelepipeds (or matrices, equivalently).
Recall that a random $n \times n$ matrix $R$ is generated by choosing its rows
$R_1, \dots, R_n$ uniformly and independently from the ball of radius $\sqrt{n}$. The
convex body corresponding to $R$ is a parallelepiped having the rows of $R$ as
facets' normals:

$$\{x \in \mathbb{R}^n : (\forall i)\ |R_i \cdot x| \le 1\}.$$

Its volume is $V : \mathbb{R}^{n \times n} \to \mathbb{R}$ given (a.s.) by $V(R) = 2^n |\det R|^{-1}$.

At a very high level, the main idea of the lower bound is the following:
after an algorithm makes all its queries, the set of inputs consistent with
those queries is a product set along rows (in the oracle model $Q'$), while the
level sets of the function that the algorithm is trying to approximate, $|\det(\cdot)|$,
are far from being product sets. In the partition of the set of inputs induced
by any decision tree of height $O(n^2/\log n)$, all parts are product sets along
rows and most parts have large volume, and therefore $V$ is dispersed in most
of them. To make this idea more precise, we first examine the structure of a
product set along rows all with exactly the same determinant. This abstract
"hyperbola" has a rather sparse structure.

Theorem 20. Let $\mathcal{R} \subseteq \mathbb{R}^{n \times n}$ be such that $\mathcal{R} = \prod_{i=1}^{n} \mathcal{R}_i$, $\mathcal{R}_i \subseteq \mathbb{R}^n$ convex,
and there exists $c > 0$ such that $|\det M| = c$ for all $M \in \mathcal{R}$. Then, for some
ordering of the $\mathcal{R}_i$'s, $\mathcal{R}_i \subseteq S_i$, with $S_i$ an $(i-1)$-dimensional affine subspace,
$0 \notin S_i$, and satisfying: $S_i$ is a translation of the linear hull of $S_{i-1}$.

Proof. By induction on $n$. It is clearly true for $n = 1$. For arbitrary $n$,
consider the dimension of the affine hull of each $\mathcal{R}_i$, and let $\mathcal{R}_1$ have minimum
dimension. Let $a \in \mathcal{R}_1$. There will be two cases:

If $\mathcal{R}_1 = \{a\}$, then let $A$ be the hyperplane orthogonal to $a$. If we denote by
$T_i$ the projection of $\mathcal{R}_i$ onto $A$, then we have that $T = \prod_{i=2}^{n} T_i$ satisfies the
hypotheses in $A \cong \mathbb{R}^{n-1}$ with constant $c/\|a\|$ and the inductive hypothesis
implies that, for some ordering, the $T_2, \dots, T_n$ are contained in affine
subspaces not containing 0 of dimensions $0, \dots, n-2$ in $A$, that is, $\mathcal{R}_2, \dots, \mathcal{R}_n$
are contained in affine subspaces not containing 0 of dimensions $1, \dots, n-1$.

If there are $a, b \in \mathcal{R}_1$, $b \ne a$, then there is no zero-dimensional $\mathcal{R}_i$. Also,
because of the condition on the determinant, $b$ is not parallel to $a$. Let
$x_\lambda = \lambda a + (1 - \lambda)b$ and consider the argument of the previous paragraph applied to
$x_\lambda$ and its orthogonal hyperplane. That is, for every $\lambda$ there is some region
$T_i$ in $A$ that is zero-dimensional. In other words, the corresponding $\mathcal{R}_i$ is
contained in a line. Because there are only $n - 1$ possible values of $i$ but an
infinite number of values of $\lambda$, we have that there exists one region $\mathcal{R}_i$ that
is picked as the zero-dimensional one for at least two different values of $\lambda$. That
is, $\mathcal{R}_i$ is contained in the intersection of two non-parallel lines, and it must
be zero-dimensional, which is a contradiction. □

Now we need to extend this to an approximate hyperbola, i.e., a product
set along rows with the property that for most of the matrices in the set,
the determinant is restricted in a given interval. This extension is the heart
of the proof and is captured in Lemma 7. We will need a bit of preparation
for its proof.

We define two properties of a matrix $R \in \mathbb{R}^{n \times n}$:

• Property $P_1(R, t)$: $\prod_{i=1}^{n} \|\pi_{R_{-i}^\perp}(R_i)\| \le t$ ("short 1-D projections").

• Property $P_2(R, t)$: $|\det \hat{R}| \ge t$ ("angles not too small").

Lemma 21. Let $R$ be drawn from distribution $D$. Then for any $\alpha > 1$,

a. $\Pr(P_1(R, \alpha^n)) \ge 1 - \frac{1}{\alpha^2}$,

6. i/iere exists (3 > 1 (^/iaf depends on a) such that Pr(P2(R, 1//?™)) > 
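Proof. For part (a), by the AM-GM inequality and Lemma 16 we have

$$E\left(\Big(\prod_{i} \|\pi_{R_{-i}^\perp}(R_i)\|^2\Big)^{1/n}\right) \le \frac{1}{n}\sum_{i} E\,\|\pi_{R_{-i}^\perp}(R_i)\|^2 = \frac{n}{n+2}.$$

Thus, by Markov's inequality,

$$\Pr\left(\prod_{i} \|\pi_{R_{-i}^\perp}(R_i)\| \ge c^n\right) = \Pr\left(\Big(\prod_{i} \|\pi_{R_{-i}^\perp}(R_i)\|^2\Big)^{1/n} \ge c^2\right) \le \frac{1}{c^2}.$$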



For part (b), we can equivalently pick each entry of $R$ independently as
$N(0, 1)$. In any case,

$$|\det \hat{R}| = \frac{\prod_i \|\tilde{R}_i\|}{\prod_i \|R_i\|}.$$

We will find an upper bound for the denominator and a lower bound for the
numerator.

For the denominator, Markov's inequality and the fact that $E\prod_i \|R_i\|^2 = n^n$ give

$$\Pr\left(\prod_i \|R_i\|^2 \ge t\,n^n\right) \le 1/t. \qquad (3)$$

2 2 

For the numerator, let //j = E = n — i + 1, let fi = ^Y\7=i \\Ri\\ = 

Now, concentration of a Gaussian vector (Lemma 15) gives

$$\Pr\big(\|\tilde{R}_i\|^2 \ge \mu_i/2\big) \ge 1 - 2^{-(n-i+1)/8}. \qquad (4)$$

Alternatively, for $t \in (0, 1)$ the fact that the density of $N(0, 1)$ is less than 1 gives

$$\Pr\big(\|\tilde{R}_i\|^2 \ge t\mu_i\big) \ge 1 - \sqrt{t(n-i+1)}. \qquad (5)$$

Let c> be such that 2-( n ~ i+1 )/ 8 < l/(2n Q+1 ) for i < n-clogn. Using 
inequality for i < n — c log n and ([5]) for the rest with 



2n 2o (clog n) 5 / 2 

we get 

Pr(fTM 2 > r^^-r-l 

I J_J_II Ml — 2 n_ c l°g n t C n / 

i=l ' 

n— clogn n 

> n pr (ii^n 2 >f) n pr(i^n 2 >^) w 



2 

1=1 i=ra— clogn 

1 

> 1 

~ n Q 

where, for some 7 > 1 we have 2 n ~ clogn t clogn < 7™. The result follows from 
equations (J6j) and (J3]). □ 

Proof (of Lemma 7). The idea of the proof is the following: if we assume
that $|\det(\cdot)|$ of most matrices in a part fits in an interval $[u, u(1+\epsilon)]$, then
for most choices $R_{-n}$ of the first $n-1$ rows in that part we have that most
choices $Y$ of the last row in that part have $|\det(R_{-n}, Y)|$ in that interval.
Thus, in view of the formula^1 $|\det(R_{-n}, Y)| = \|\tilde Y\| \prod_{i=1}^{n-1} \|\tilde R_i\|$ we have that,
for most values of $Y$,

\[ \|\tilde Y\| \in [u, u(1+\epsilon)] \prod_{i=1}^{n-1} \|\tilde R_i\|^{-1}, \]

where $\tilde Y$ is the projection of $Y$ to the line orthogonal to $R_1, \dots, R_{n-1}$. In
other words, most choices of the last row are forced to be contained in a set
of the form $\{x : b \le |a \cdot x| \le c\}$, that we call a double band, and the same
argument works for the other rows. In a similar way, we get a pair of double
bands of "complementary" widths for every pair of rows. These constraints
on the part imply that it has small volume, giving a contradiction. This
argument only works for parts containing mostly "matrices that are not too
singular" (matrices that satisfy $P_1$ and $P_2$), and we choose the parameters
of these properties so that at least half of $(\sqrt{n}B_n)^n$ satisfies them.

^1 Recall that $\tilde R_i$ is the projection of $R_i$ to the subspace orthogonal to $R_1, \dots, R_{i-1}$.
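
The determinant formula recalled in the footnote can be checked numerically. The sketch below (helper names are illustrative, not from the paper) computes $|\det R|$ by Gaussian elimination and compares it with the product of the norms $\|\tilde R_i\|$ obtained by Gram-Schmidt orthogonalization of the rows.

```python
import math
import random

def abs_det(rows):
    """|det| of a square matrix via Gaussian elimination with partial pivoting."""
    a = [list(r) for r in rows]
    n = len(a)
    det = 1.0
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            return 0.0
        a[k], a[p] = a[p], a[k]  # a row swap only flips the sign
        det *= a[k][k]
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= f * a[k][j]
    return abs(det)

def gram_schmidt_norms(rows):
    """Norms of row i projected orthogonally to the span of rows 1..i-1."""
    basis = []  # orthonormal basis of the span of the previous rows
    norms = []
    for r in rows:
        v = list(r)
        for q in basis:
            c = sum(x * y for x, y in zip(v, q))
            v = [x - c * y for x, y in zip(v, q)]
        nv = math.sqrt(sum(x * x for x in v))
        norms.append(nv)
        if nv > 0:
            basis.append([x / nv for x in v])
    return norms

def product(values):
    """Product of a list of numbers."""
    out = 1.0
    for v in values:
        out *= v
    return out
```

For a random matrix `R`, `product(gram_schmidt_norms(R))` and `abs_det(R)` agree up to rounding error.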

We will first choose $N'$ as the family of large parts that satisfy properties
$P_1$ and $P_2$ for suitable parameters so that (a) is satisfied. We will
say "probability of a subset of $(\sqrt{n}B_n)^n$" to mean its probability with respect
to the uniform probability measure on $(\sqrt{n}B_n)^n$. The total probability
of the parts having probability at most $a$ is at most $a|N|$. Thus, setting
$a = 1/(4|N|)$, the parts having probability at least $1/(4|N|) \ge 1/2^{n^2}$ have
total probability at least 3/4. Since $\operatorname{vol} \bigcup_{j \in N} A_j \ge 2^{n^2}$, each of those parts
has volume at least 1. Let these parts be indexed by $N'' \subseteq N$. We choose
parameters in Lemma 21 (say, $\alpha = 4$ for part (a), $\alpha = 2$ for part (b), giving
the existence of some $\beta$) so that at least 7/8 of $(\sqrt{n}B_n)^n$ satisfy $P_1(\cdot, 4^n)$ and
$P_2(\cdot, 1/\beta^n)$, and then at least 3/4 of the parts in probability satisfy $P_1(\cdot, 4^n)$
and $P_2(\cdot, 1/\beta^n)$ for at least half of the part in probability. Let $N''' \subseteq N$ be
the set of indices of these parts. Let $N' = N'' \cap N'''$. We have that $\bigcup_{j \in N'} A_j$
has probability at least 1/2.

We will now prove (b). Let $A = \prod_{i=1}^n A_i$ be one of the parts indexed by
$N'$. Let $X$ be random in $A$. Let $\epsilon$ be a constant and $p_1(n)$ be a function
of $n$, both to be fixed later. Assume for a contradiction that there exists $u$
such that

\[ \Pr(|\det X| \notin [u, u(1+\epsilon)]) < p_1(n). \qquad (7) \]

Let $G \subseteq A$ be the set of $M \in A$ such that $|\det M| \in [u, u(1+\epsilon)]$. Let $p_2(n)$,
$p_3(n)$ be functions of $n$ to be chosen later. Consider the subset of points
$R \in G$ satisfying:

I. $P_1(R, 4^n)$ and $P_2(R, 1/\beta^n)$,

II. for any $i \in \{1, \dots, n\}$, for at most a $p_2(n)$ fraction of $Y \in A_i$ we have
$(Y, R_{-i}) \notin G$, and

III. for any $i, j \in \{1, \dots, n\}$, $i \ne j$, for at most a $p_3(n)$ fraction of $(Y, Z) \in A_i \times A_j$
we have $(Y, Z, R_{-ij}) \notin G$.

Because of the constraints, such a subset is a

$1 - \Pr(X \notin G) - \Pr(X \in G \text{ and not as I, II and III}) \ge$

> 1 — pi m n — — n — i-^ 8) 

piy 1 2 p 2 {n) pa(n) 1 7 

fraction of $A$. The function $p_1(n)$ will be chosen at the end so that the right
hand side is positive. Fix a matrix $R = (R_1, \dots, R_n)$ in that subset.

The constraints described in the first paragraph of the proof are formalized
in Lemma 22 which, for all $i, j$, gives sets $B_{ij}$ (double bands, of the
form $\{x : b \le |a \cdot x| \le c\}$), such that most of $A_i$ is contained in $\bigcap_{j=1}^n B_{ij}$.
Lemma 22 is invoked in the following way: for each pair $i, j$ with $i < j$, let
$E$ be the two-dimensional subspace orthogonal to all the rows of $R$ except
$i, j$. We set $X_1$ (respectively $X_2$) distributed as the marginal in $E$ of the uniform
probability measure on $A_i$ (respectively $A_j$). We also set $a_1 = \pi_E(R_i)$,
$a_2 = \pi_E(R_j)$, $p = p_3(n)$, $q = p_2(n)$ and $u$ and $\epsilon$ as here, while $\gamma$ will be
chosen later.

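Let $l_{ij}$ be the width of (each component of) the double band $B_{ij}$. Then,
according to Lemma 22, the following relations hold:

\[ l_{ii} \le \epsilon \|\pi_{R_{-i}^\perp}(R_i)\| \quad \text{for any } i, \]

\[ l_{ij} \le 4\epsilon \|\pi_{R_{-i}^\perp}(R_i)\| \|\pi_{R_{-j}^\perp}(R_j)\| / l_{ji} \quad \text{for } i > j. \]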

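Since each double band has two components, the intersection of all the
$n$ bands associated to a particular region $A_i$, namely $\bigcap_{j=1}^n B_{ij}$, is the union
of $2^n$ congruent parallelepipeds. Thus, using properties $P_1$ and $P_2$ of $R$ and
fixing $\epsilon$ as a sufficiently small constant, the "feasible region" defined by the
double bands, $B = \prod_{i=1}^n \bigcap_{j=1}^n B_{ij}$, satisfies:

\[
\operatorname{vol} B \le 2^{n^2} \frac{\prod_{i,j=1}^n l_{ij}}{|\det \hat R|^n}
\le 2^{n^2} \frac{\prod_{i=1}^n \Big( \epsilon \|\pi_{R_{-i}^\perp}(R_i)\| \prod_{j<i} 4\epsilon \|\pi_{R_{-i}^\perp}(R_i)\| \|\pi_{R_{-j}^\perp}(R_j)\| \Big)}{|\det \hat R|^n}
\]
\[
= 2^{n^2} \frac{\epsilon^{\binom{n+1}{2}} 4^{\binom{n}{2}} \prod_{i=1}^n \|\pi_{R_{-i}^\perp}(R_i)\|^n}{|\det \hat R|^n}
\le 1/4^n.
\]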

Each region $A_i$ is not much bigger than the intersection of the corresponding
double bands $B_i = \bigcap_{j=1}^n B_{ij}$, as follows: restricting to the double band $B_{ii}$
removes at most a $p_2(n)$ fraction of $A_i$, each double band $B_{ij}$ for $j < i$
removes at most a $\gamma$ fraction of $A_i$, and each double band $B_{ij}$ for $j > i$
removes a $p_2(n) + (p_3(n)/\gamma)$ fraction of $A_i$. We set $\gamma = 1/(4n^2)$, $p_2(n) =
1/(4n^2)$ and $p_3(n) = 1/(16n^4)$ so that, as a fraction of $\operatorname{vol} A_i$, $\operatorname{vol} B_i$ is no
less than

\[ 1 - p_2(n) - (i-1)\gamma - (n-i)\Big( p_2(n) + \frac{p_3(n)}{\gamma} \Big) \ge 1/2. \]

Thus, $\operatorname{vol} A \le 2^n \operatorname{vol} B \le 1/2^n$, which is a contradiction. The condition on
$p_1(n)$ given by Equation (8) is satisfied for $p_1(n) = 1/(2^7 n^6)$. □

Lemma 22 (2-D lemma). Let $X_1, X_2$ be two independent random vectors
in $\mathbb{R}^2$ with bounded support (not necessarily with the same distribution). Let
$X$ be a random matrix with rows $X_1, X_2$. Assume that there exist $u > 0$,
$0 < \epsilon \le 1$ such that

\[ \Pr(|\det X| \notin [u, u(1+\epsilon)]) < p. \]

Let $G = \{M \in \mathbb{R}^{2\times 2} : |\det M| \in [u, u(1+\epsilon)]\}$. Let $a_1, a_2 \in \mathbb{R}^2$ be such that
$(a_1, a_2) \in G$ and

\[ \Pr(X_1 : (X_1, a_2) \notin G) \le q, \qquad \Pr(X_2 : (X_2, a_1) \notin G) \le q. \]

Let $\gamma > p/(1-q)$. Then there exist double bands $B_{ij} \subseteq \mathbb{R}^2$, $b_{ij} \ge 0$,
$i, j \in \{1, 2\}$, $l > 0$,

\[ B_{11} = \{x : |a_2^\perp \cdot x| \in [b_{11}, b_{11} + \epsilon\|\pi_{a_2^\perp}(a_1)\|]\} \]
\[ B_{22} = \{x : |a_1^\perp \cdot x| \in [b_{22}, b_{22} + \epsilon\|\pi_{a_1^\perp}(a_2)\|]\} \]
\[ B_{12} = \{x : |a_1^\perp \cdot x| \in [b_{12}, b_{12} + l]\} \]
\[ B_{21} = \{x : |a_2^\perp \cdot x| \in [b_{21}, b_{21} + 4\epsilon\|\pi_{a_2^\perp}(a_1)\|\|\pi_{a_1^\perp}(a_2)\|/l]\} \]

such that

\[ \Pr(X_1 \notin B_{11}) \le q, \qquad \Pr(X_1 \notin B_{12}) \le q + (p/\gamma), \]

\[ \Pr(X_2 \notin B_{21}) \le \gamma, \qquad \Pr(X_2 \notin B_{22}) \le q. \]

Proof. The proof refers to Figure 1, which depicts the bands under consideration.

A double band of the form $\{x : |a \cdot x| \in [u, v]\}$ has (additive or absolute)
width $v - u$ and relative (or multiplicative) width $v/u$. Consider the
expansion $|\det X| = \|X_2\| \|\pi_{X_2^\perp}(X_1)\|$ and the definition of $a_2$ to get

\[ \Pr\big(\|\pi_{a_2^\perp}(X_1)\| \notin \|a_2\|^{-1}[u, u(1+\epsilon)]\big) \le q. \]

That is, with probability at most $q$ we have $X_1$ outside of a double band of
relative width $1 + \epsilon$:

\[ B_{11} = \{x : \|\pi_{a_2^\perp}(x)\| \in \|a_2\|^{-1}[u, u(1+\epsilon)]\}. \]

Figure 1: The 2-D argument.



Because $a_1 \in B_{11}$, the absolute width is at most $\epsilon\|\pi_{a_2^\perp}(a_1)\|$. If we exchange
the roles of $a_1$ and $a_2$ in the previous argument, we get a double band $B_{22}$.

Let $A$ be the set of $a \in \mathbb{R}^2$ satisfying: $(a, a_2) \in G$ and with probability
at most $\gamma$ over $X_2$ we have $(a, X_2) \notin G$. We have that

\[ \Pr(X_1 \in A) \ge 1 - q - \frac{p}{\gamma}. \]

Consider a point $C \in A$ that maximizes the distance to the span of $a_1$. Similarly
to the construction of $B_{11}$, by definition of $A$ and with probability at
most $\gamma$ we have $X_2$ outside of a double band of relative width $1 + \epsilon$. We denote
it $B'_{21}$. In order to have better control of the angles between the bands,
we want to consider a bigger double band parallel to $B_{11}$, the minimum such
band that contains the intersection of $B_{22}$ and $B'_{21}$. Call this band $B_{21}$.
Consider the line through the origin $O$ parallel to $C - a_1$, and points $M$ and
$N$ where the boundary of one component of the double band $B_{22}$ intersects
the line; $M$ is the point closest to the origin, $N$ the farthest. The boundary
of $B'_{21}$ intersects the boundary of $B_{22}$ precisely at $\pm M$ and $\pm N$, because
for any vector $v \in \mathbb{R}^2$ parallel to $C - a_1$ we have $|\det(v, C)| = |\det(v, a_1)|$.
Consider the components of $B'_{21}$ and $B_{22}$ containing $M$ and $N$ and let $P$ be
any of the other two points where the boundaries of those components meet.
This implies that triangles $O a_1 C$ and $PMN$ are similar. The width of $B_{21}$
is at most $2x$, where $x = \max\{\|\pi_{a_2^\perp}(P - M)\|, \|\pi_{a_2^\perp}(P - N)\|\}$. Then,

\[ \frac{x}{z} \le \frac{y}{l}, \]

where $l = \|\pi_{a_1^\perp}(C)\|$ is the width of a band imposed on $A$ by definition of
$C$, $y$ is the width of $B_{22}$, $y \le \epsilon\|\pi_{a_1^\perp}(a_2)\|$, and $z$ is the distance between $C$
or $a_1$ and the span of $a_2$, whichever is larger, that is,

\[ z = \max\{\|\pi_{a_2^\perp}(C)\|, \|\pi_{a_2^\perp}(a_1)\|\} \le (1+\epsilon)\|\pi_{a_2^\perp}(a_1)\| \le 2\|\pi_{a_2^\perp}(a_1)\|. \]

Thus, $x \le 2\epsilon\|\pi_{a_2^\perp}(a_1)\|\|\pi_{a_1^\perp}(a_2)\|/l$. Let $B_{12}$ be the band imposed on $A$ by
definition of $C$. □
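
To make the notion of a double band concrete, here is a small sketch in the plane. It illustrates why, for a fixed second row $a_2$, the set of first rows $x$ with $|\det(x, a_2)| \in [u, u(1+\epsilon)]$ is a double band whose normal is $a_2$ rotated by a right angle. The function names are illustrative, and the normal is deliberately left unnormalized, so the band limits are stated for the unnormalized inner product.

```python
def det2(x, y):
    """Determinant of the 2x2 matrix with rows x and y."""
    return x[0] * y[1] - x[1] * y[0]

def in_G(x, y, u, eps):
    """Whether |det(x, y)| lies in [u, u(1 + eps)]."""
    return u <= abs(det2(x, y)) <= u * (1 + eps)

def in_double_band(x, normal, lo, hi):
    """Whether x lies in the double band {x : lo <= |normal . x| <= hi}."""
    s = abs(normal[0] * x[0] + normal[1] * x[1])
    return lo <= s <= hi

def band_for_fixed_row(a2, u, eps):
    """Double band of all x with (x, a2) in G, as (normal, lo, hi)."""
    normal = (a2[1], -a2[0])  # det2(x, a2) equals normal . x
    return normal, u, u * (1 + eps)
```

The band has two components, one for each sign of the inner product, which is why both `(0.6, 0.0)` and `(-0.6, 0.0)` belong to the band built from `a2 = (1.0, 2.0)`, `u = 1.0`, `eps = 0.5`.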

We are now ready to prove the complexity lower bounds. 

Proof of Theorem 3. In view of Yao's lemma, it is enough to prove a lower
bound on the complexity of deterministic algorithms against a distribution
and then a lower bound on the minimum singular value of matrices according
to that distribution. The deterministic lower bound is a consequence of
the dispersion of the determinant proved in Lemma 7; the bound on the
minimum singular value is an easy adaptation of a bound on the minimum
singular value of a Gaussian matrix given by Lemma 14. These two claims
are formalized below.

Claim 1: Let $R$ be a random input according to distribution $D$. Then
there exists a constant $c > 0$ such that any deterministic algorithm that
outputs a number $V$ such that

\[ (1 - c)|\det R| \le V \le (1 + c)|\det R| \]

with probability at least $1 - 1/(2^8 n^6)$ makes more than

\[ \frac{n^2 - 2}{\log_2(2n+1)} \]

queries in the oracle model $Q'$.

Claim 2: Let $A$ be an $n \times n$ random matrix from distribution $D$. Let $\sigma$
be the minimum singular value of $A$. Then for any $t \ge 0$,

\[ \Pr(\sigma\sqrt{n} \le t) \le 4t + \frac{n}{2^{n-1}} \]

(the choice of $t = 1/(2^{12} n^6)$ proves Theorem 3).

Proof of Claim 1: For a deterministic algorithm and a value of $n$, consider
the corresponding decision tree. Let

\[ h \le \frac{n^2 - 2}{\log_2(2n+1)} \]

be the height and $L$ be the set of leaves of this tree. Let $(P_l)_{l \in L}$ be the
partition on the support of $D$ induced by the tree.

Every query has at most $2n + 1$ different answers, and every path has
height at most $h$. Thus,

\[ |L| \le (2n+1)^h \le 2^{n^2 - 2}. \]

The sets $P_l$ are product sets along rows by Lemma 13, and hence by Lemma
7 we have that there exists a constant $c > 0$ such that with probability at
least $1/(2^8 n^6)$ and for any $a > 0$ we have that $|\det R|$ is outside of $[a, (1+c)a]$.
Claim 1 follows.
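
The leaf count in the proof of Claim 1 is simple arithmetic, and the sketch below checks it with exact integers: for the largest integer height allowed by the bound on $h$, a tree with branching factor $2n+1$ has at most $2^{n^2-2}$ leaves. The function names are illustrative, and rounding $h$ down to an integer is an assumption made here for the check.

```python
import math

def max_queries(n):
    """Largest integer h with h <= (n^2 - 2) / log2(2n + 1)."""
    return math.floor((n * n - 2) / math.log2(2 * n + 1))

def leaf_bound_holds(n, h):
    """Exact integer check that (2n + 1)^h <= 2^(n^2 - 2)."""
    return (2 * n + 1) ** h <= 2 ** (n * n - 2)
```

For instance, with `n = 10` the bound allows `h = 22` queries, and one more query would already break the leaf bound.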

Proof of Claim 2: We will bound $\|A^{-1}\|_2 = 1/\sigma$. To achieve this, we
will reduce the problem to the case where the entries of the matrix are
$N(0, 1)$ and independent. We write $A = GDE$, where $G$ has its entries
independently as $N(0, 1)$, $D$ is the diagonal matrix that normalizes the rows
of $G$ and $E$ is another random diagonal matrix independent of $(G, D)$ that
scales the rows of $GD$ to give them the length distribution of a random
vector in $\sqrt{n}B_n$. We have

\[ \|A^{-1}\|_2 \le \|D^{-1}\|_2 \|E^{-1}\|_2 \|G^{-1}\|_2. \qquad (9) \]

Now, with probability at least $1 - n/2^n$ the diagonal entries of $E$ are at least
$\sqrt{n}/2$. Thus, except for an event that happens with probability $n/2^n$,

\[ \|E^{-1}\|_2 \le 2/\sqrt{n}. \qquad (10) \]

On the other hand, Lemma 15 (with $\epsilon = 3$) implies that with probability at
least $1 - n/2^n$ the diagonal entries of $D^{-1}$ are at most $2\sqrt{n}$. Thus, except
for an event that happens with probability $n/2^n$,

\[ \|D^{-1}\|_2 \le 2\sqrt{n}. \qquad (11) \]

From (9), (10) and (11), we get $\|A^{-1}\|_2 \le 4\|G^{-1}\|_2$. Using Lemma 14,
which bounds the singular values for a Gaussian matrix, Claim 2 follows. □

Finally, Theorem 2 is a simple consequence.

Proof of Theorem 2. It remains to prove that a parallelepiped given by a
matrix $A$ as in Theorem 3 contains $B_n/\sqrt{n}$ and is contained in $\frac{\sqrt{n}}{\sigma}B_n$ whenever
$\sigma > 0$, where $\sigma$ is the minimum singular value of $A$. The first inclusion
is evident since the entries must be from $[-1, 1]$. It is sufficient to prove
the second inclusion for the vertices of the parallelepiped, i.e., solutions to
$Ax = b$ for any $b \in \{-1, 1\}^n$. That is, $x = A^{-1}b$ and therefore

\[ \|x\| \le \|A^{-1}\|_2 \|b\| \le \sqrt{n}/\sigma. \]

□
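
The vertex bound in this proof can be illustrated in the smallest nontrivial case. The sketch below is restricted to $n = 2$ (an assumption made so that the minimum singular value has a closed form from $\sigma_1^2 + \sigma_2^2 = \|A\|_F^2$ and $\sigma_1\sigma_2 = |\det A|$); it checks that every vertex $x = A^{-1}b$ with $b \in \{-1, 1\}^2$ satisfies $\|x\| \le \sqrt{2}/\sigma$. The helper names are illustrative.

```python
import math
from itertools import product

def min_singular_value_2x2(a):
    """Smallest singular value of a 2x2 matrix from its Frobenius norm and determinant."""
    frob2 = sum(v * v for row in a for v in row)   # sigma1^2 + sigma2^2
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]      # +/- sigma1 * sigma2
    disc = math.sqrt(max(frob2 * frob2 - 4 * det * det, 0.0))
    return math.sqrt(max((frob2 - disc) / 2, 0.0))

def solve_2x2(a, b):
    """Solve a x = b by Cramer's rule (a must be nonsingular)."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return ((b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det)

def vertices_within_bound(a):
    """Check ||x|| <= sqrt(2) / sigma for every vertex x = a^{-1} b, b in {-1, 1}^2."""
    sigma = min_singular_value_2x2(a)
    bound = math.sqrt(2) / sigma
    # A tiny relative slack absorbs floating-point rounding.
    return all(math.hypot(*solve_2x2(a, b)) <= bound * (1 + 1e-12)
               for b in product((-1, 1), repeat=2))
```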



5.1 Nonadaptive volume algorithms 

An algorithm is nonadaptive if its queries are independent of the input. 

Theorem 23 (nonadaptive lower bound). Let $K$ be a convex body given
by a membership oracle such that $B_n \subseteq K \subseteq 2nB_n$. Then any nonadaptive
randomized algorithm that outputs a number $V$ such that $0.9\operatorname{vol}(K) \le
V \le 1.1\operatorname{vol}(K)$ holds with probability at least 3/4 has complexity at least

2e(n+2) " 

Proof. Consider the distribution on parallelepipeds induced by the following
procedure: first, with equal probability choose one of the following bodies:

• ("brick") $\{x \in \mathbb{R}^n : (\forall i \in \{2, \dots, n\})\ |x_i| \le 1\} \cap nB_n$

• ("double brick") $\{x \in \mathbb{R}^n : (\forall i \in \{2, \dots, n\})\ |x_i| \le 1\} \cap 2nB_n$

and then, independently of the first choice, apply a random rotation.

We will prove the following claim, from which the desired conclusion can
be obtained by means of Yao's lemma.

Claim: Let $K$ be a parallelepiped according to the previous distribution.
Then any nonadaptive deterministic algorithm that outputs a number $V$
such that

\[ 0.9\operatorname{vol}(K) \le V \le 1.1\operatorname{vol}(K) \qquad (12) \]

holds with probability more than i + ^r(^) n ^ 2 has complexity at least Q. 

Proof of Claim: To satisfy Equation (12), the algorithm has to actually
distinguish between the brick and the double brick. Let the bad surface be 
the intersection between the input and the sphere of radius n. In order to 
distinguish between the two bodies, the algorithm has to make at least one 
query whose ray hits the bad surface. We will prove that the probability 
of this event is no more than 2Q{2/ e-nn) 11 ^ 2 . To see this, observe that the 
probability of a query hitting the bad surface is at most the volume of the
bad surface divided by the volume of the sphere of radius $n$. The former can
be bounded in the following way: let $x = (x_2, \dots, x_n)$ be the coordinates
along the normals to the $n - 1$ facets of the body. Parameterize one of
the hemispheres determined by the hyperplane containing those normals as
$F(x_2, \dots, x_n) = \sqrt{n^2 - x_2^2 - \cdots - x_n^2}$.

We have that

\[ \Big( \frac{\partial}{\partial x_i} F(x) \Big)^2 = \frac{x_i^2}{F(x)^2}. \]

In the domain of integration $[-1, 1]^{n-1}$ we have $\|x\|^2 \le n$ and this implies
that in that domain

\[ \|\nabla F(x)\|^2 = \frac{\|x\|^2}{n^2 - \|x\|^2} \le \frac{1}{n-1}. \]

The volume of the bad surface is given by

\[ 2 \int_{[-1,1]^{n-1}} \sqrt{1 + \|\nabla F(x)\|^2}\, dx \le 2^n \sqrt{1 + \frac{1}{n-1}} \le 2^{n+1}. \]

The volume of the sphere of radius n is 

T(n/2) " (n/2)"/2 n ( > ' 
Thus, the probability that a particular query hits the bad surface is at most 



\nir J 



n/2 



Therefore the algorithm gives the wrong answer with probability at least 

K i -*(=D- 

□ 



6 Lower bound for the product 

Proof (of Lemma 10). Let the distribution function be $F(t) = \Pr(X \le
t) = e^{g(t)}$ for some concave function $g$, and the density is $f(t) = g'(t)e^{g(t)}$
where $g'(t)$ is nonincreasing. First, we observe that logconcavity implies that
$F(\mu) \ge 1/4$. To see this, let $\mu - l$ be the point where $F(\mu - l) = F(\mu)/2$.
Then, $F(\mu - il) \le F(\mu)/2^i$ and

\[
\int_0^\mu (\mu - x) f(x)\, dx \le \sum_{i \ge 1} \big( F(\mu - (i-1)l) - F(\mu - il) \big)(il)
\]
\[
\le F(\mu) l + \sum_{i \ge 1} F(\mu - il)\big((i+1) - i\big) l
\]
\[
\le F(\mu) l \sum_{i \ge 0} \frac{1}{2^i} = 2F(\mu) l.
\]



On the other hand (assuming $F(\mu) < 1/4$, otherwise, there is nothing to
prove),



Llog(l/F(/i))J 



(x- M )/(x)dx> ^ (2 i -2 i - 1 )F( fJ ,)(i-l)l 

i=i 



> log(l/F(/i)) z 



Therefore, we must have $2F(\mu) \ge \log(1/F(\mu))/2$, which implies $F(\mu) \ge 1/4$.
Next,

(/j,-x)f(x)dx> / (n-x)f{x)dx> F{jj,-l)l> -. 



'0 J 

Therefore, since /i is the mean, 



(x — /x)/(x) dx > -. 



It follows that 



(x- M ) 2 /(^)dx>-. (13) 

Suppose $l \le \sigma/4$. Then,

\[
\int_0^\mu (x - \mu)^2 f(x)\, dx \le \sum_{i \ge 1} \big( F(\mu - (i-1)l) - F(\mu - il) \big)(il)^2
\]
\[
\le F(\mu) l^2 + \sum_{i \ge 1} F(\mu - il)\big((i+1)^2 - i^2\big) l^2
\]
\[
\le F(\mu) l^2 \sum_{i \ge 1} \frac{2i+1}{2^i} = 5 l^2 F(\mu) \le \sigma^2/2.
\]

Since

\[ \sigma^2 = \int_0^\infty (x - \mu)^2 f(x)\, dx = \int_0^\mu (x - \mu)^2 f(x)\, dx + \int_\mu^\infty (x - \mu)^2 f(x)\, dx, \]

we must have

\[ \int_\mu^\infty (x - \mu)^2 f(x)\, dx \ge \frac{\sigma^2}{2}. \]

Using this and (13), we have (regardless of the magnitude of $l$),

1 a 2 

(x-^ff{x)>-^. (14) 

Now we consider intervals to the right of $\mu$. Let $J_0 = (\mu, x_0]$ where $x_0$
is the smallest point to the right of $\mu$ for which $f(x_0) \le 1/\sigma$ ($J_0$ could be
empty). Let $J_i$, for $i = 1, 2, \dots, m = 3\log(M/\sigma) + 14$, be $[x_{i-1}, x_i]$ where
$x_i$ is the smallest point for which $f(x_i) \le 1/(\sigma 2^i)$. For any $t \ge t' \ge \mu$,
$f(t') \ge f(t)F(t')/F(t) \ge f(t)F(\mu) \ge f(t)/4$. Therefore, the function $f$ is
approximately constant in any interval $J_i$ for $i \ge 1$. If $x_0 > \mu + \sigma/64$, then
the interval $[\mu, \mu + \sigma/64]$ satisfies the desired property (as $f(x) \ge f(x_0)$ for
$x$ in this interval, we can take $\alpha = f(x_0)\sigma/64 = 1/64$). Otherwise,

\[ \int_{J_0} (x - \mu)^2 f(x)\, dx \le \sigma^2/2^{12}. \]

Also,

\[ \int_{x_m}^\infty (x - \mu)^2 f(x)\, dx \le 4M^3 f(x_m) \le \sigma^2/2^{12}. \]



Therefore, from (14), for some $i^* \ge 1$ we have

(x - fi) 2 f(x)dx > 

The interval $[\mu, x_{i^*}]$ then completes the proof: for this interval we can take
$\alpha = f(x_{i^*})(x_{i^*} - \mu)$, and we have

\[ \int_{J_{i^*}} (x - \mu)^2 f(x)\, dx \le 8(x_{i^*} - \mu)^2 (x_{i^*} - x_{i^*-1}) f(x_{i^*}) \le 8\alpha(x_{i^*} - \mu)^2. \]

□

Proof of the lower bound for the product. For this lower bound, we use the distribution $D'$ on
matrices. Let $R$ be an $n \times n$ random matrix having each entry uniformly
and independently in $[-1, 1]$. On input $R$ from distribution $D'$ having rows
$(R_1, \dots, R_n)$ and with probability at least 1/2 over the inputs, we consider
algorithms that output an approximation to $f(R) = \prod_i \|R_i\|$. The next
claim for deterministic algorithms, along with Yao's lemma, proves the theorem.

Claim: Suppose that a deterministic algorithm makes at most

\[ h := \frac{n^2/2 - 1}{\log_2(2n+1)} \]

queries on any input $R$ and outputs $V$. Then there exists a constant $c > 0$
such that the probability of the event

\[ \Big( 1 - \frac{c}{\log n} \Big) f(R) \le V \le \Big( 1 + \frac{c}{\log n} \Big) f(R) \]

is at most $1 - O(1/n)$.

To prove the claim, we consider a decision tree corresponding to a deterministic
algorithm. Let $P_l$ be the set of matrices associated with a leaf
$l$. By Lemma 13, we have that the set $P_l$ is a product set along rows, that
is $P_l = \prod_i \mathcal{R}_i$, where $\mathcal{R}_i \subseteq \mathbb{R}^n$ is the set of possible choices of the row
$R_i$ consistent with $l$. The conditional distribution of $R$ at a leaf $l$ consists
of independent, uniform choices of the rows from their corresponding sets.
Moreover, the sets $\mathcal{R}_i$ are polytopes with at most $f = 2n + 2h$ facets. Every
query has at most $2n + 1$ different answers, and every path has height at
most $h$. Thus, $|L| \le (2n+1)^h = 2^{n^2/2 - 1}$. The total probability of the leaves
having probability at most $a$ is at most $a|L|$. Thus, setting $a = 1/(2|L|)$,
the leaves having probability at least

\[ \frac{1}{2|L|} \ge \frac{1}{2^{n^2/2}} \]

have total probability at least 1/2. Because $\operatorname{vol} \bigcup_{l \in L} P_l = 2^{n^2}$, we have that
those leaves have volume at least $2^{n^2/2}$. Further, since $P_l = \prod_i \mathcal{R}_i$, we have
that for such $P_l$ at least $n/2$ of the $\mathcal{R}_i$'s have volume at least 1. Theorem
6 implies that for those $\operatorname{var}\|R_i\|^2 \ge \Omega(n/\log n)$. Along with the fact that
$\|R_i\| \le \sqrt{n}$ and Lemma 18, for a random matrix $R$ from such a $P_l$, we get

\[ \frac{\operatorname{var}(f(R)^2)}{(E(f(R)^2))^2} \ge \sum_i \frac{\operatorname{var}(\|R_i\|^2)}{(E(\|R_i\|^2))^2} \ge \Omega\Big( \frac{1}{\log n} \Big). \]

Thus, the variance of $f(R)$ is large. However, this does not directly imply
that $f(R)$ is dispersed since the support of $f(R)$ could be of exponential
length and its distribution is not logconcave.

Let $X = \prod_{i=1}^n X_i$ where $X_i = \|R_i\|^2$. To prove the lower bound, we need
to show that $\operatorname{disp}_X(p)$ is large for $p$ at least inverse polynomial in $n$. For
$i$ such that $\operatorname{vol}(\mathcal{R}_i) \ge 1$, we have $\operatorname{var} X_i = \Omega(n/\log n)$ by Theorem 6. As
remarked earlier, at least $n/2$ sets satisfy the volume condition and we will
henceforth focus our attention on them. We also get

\[ E(X_i) \ge n/16 \qquad (15) \]

from this. The distribution function of each $X_i$ is logconcave (although not
its density) and its support is contained in $[0, n]$. So by Lemma 10, we can
decompose the density $f_i$ of each $X_i$ as $f_i(x) = p_i g_i(x) + (1 - p_i) g'_i(x)$, where
$g_i$ is the uniform distribution over an interval $[a_i, b_i]$ of length $L_i$ and

\[ p_i L_i^2 = \Omega\Big( \frac{n}{\log^2 n} \Big) \quad \text{and} \quad p_i = \Omega\Big( \frac{1}{n \log^2 n} \Big). \]

We will assume that $p_i L_i^2 = cn/\log^2 n$ and $p_i = \Omega(1/n^2)$. This can be
achieved by noting that $L_i$ is originally at most $n$ and truncating the interval
suitably. Let $X'_i$ be a random variable drawn uniformly from the interval
$[a_i, b_i]$. Let $Y_i = \log X'_i$, $I$ be a subset of $\{1, 2, \dots, n\}$ and $Y_I = \sum_{i \in I} \log X'_i$.
The density of $Y_i$ is $h_i(t) = e^t/L_i$ for $\log a_i \le t \le \log b_i$ and zero outside
this range. Thus $Y_i$ has a logconcave density and so does $Y_I$ (the sum of
random variables with logconcave density also has a logconcave density).
Also, $\operatorname{var}(Y_I) = \sum_{i \in I} \operatorname{var}(Y_i)$. To bound the variance of $Y_i$, we note that
since $a_i \ge E(X_i) \ge n/16$ by Lemma 10 and Equation (15), we have $b_i \le 16a_i$
and so $h_i(t)$ varies by a factor of at most 16. Thus, we can decompose $h_i$
further into $h'_i$ and $h''_i$ where $h'_i$ is uniform over $[\log a_i, \log b_i]$ and

\[ h_i(x) = \frac{1}{16} h'_i(x) + \frac{15}{16} h''_i(x). \]

Let $Y'_i$ have density $h'_i$. Then

\[ \operatorname{var}(Y_i) \ge \frac{1}{16} \operatorname{var}(Y'_i) = \frac{(\log b_i - \log a_i)^2}{192}. \]
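
The decomposition of $h_i$ into a uniform part of weight 1/16 plus a remainder requires $h_i \ge \frac{1}{16} h'_i$ pointwise on $[\log a_i, \log b_i]$. The sketch below checks this on a grid for given endpoints with $b \le 16a$; the function names and the grid resolution are illustrative choices, and a grid check is only a numerical illustration of the inequality.

```python
import math

def log_density(t, a, b):
    """Density at t of Y = log X' for X' uniform on [a, b]; valid for log a <= t <= log b."""
    return math.exp(t) / (b - a)

def uniform_log_density(a, b):
    """Density of the uniform distribution on [log a, log b]."""
    return 1.0 / (math.log(b) - math.log(a))

def dominates_uniform(a, b, weight=1 / 16, steps=1000):
    """Check log_density >= weight * uniform density on a grid of [log a, log b]."""
    lo, hi = math.log(a), math.log(b)
    u = uniform_log_density(a, b)
    return all(log_density(lo + (hi - lo) * k / steps, a, b) >= weight * u
               for k in range(steps + 1))
```

With weight 1 the domination fails for wide intervals such as `[1, 16]`, which is why a mixture weight below 1 is needed.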



Therefore

\[ \operatorname{var}(Y_I) \ge \frac{1}{192} \sum_{i \in I} (\log b_i - \log a_i)^2. \]

From this we get a bound on the dispersion of $Y_I$ using the logconcavity of
$Y_I$ and Lemma 8(b). The bound depends on the set $I$ of indices that are
chosen. This set is itself a random variable defined by the decompositions
of the $X_i$'s. We have



n 1 n j-2 

1 = 1 8=1 V ly 



Cl 



log 2 n 



On the other hand, 



var/(var(Y 7 )) < ^pi(log&i - loga^ 
Li 



i=l 
n 



i=i 



< 



16 4 



2 r4 



_ 16 4 c 2 n 2 " 1 
n 4 log 4 nf^Pi' 

Suppose $p_i \ge c_2/n$ for all $i$. Then we get,

c ' 

var/(var(F/)) < - — \- 
log n 

and for $c_2$ large enough, $\operatorname{var}_I(\operatorname{var}(Y_I)) \le (E_I \operatorname{var}(Y_I))^2/4$. Hence, using
Chebyshev's inequality, with probability at least 1/4, $\operatorname{var}(Y_I) \ge c_1/(4\log^2 n)$.
By Lemma[8^b), with probability at least 1/4, we have dispy 7 (1/2) > 4 ]^" n ■ 
This implies that for any u, 



Pr X G 



u, u\ 1 + 



41ogn 



7 

< -. 



Finally, if for some $i$, $p_i < c_2/n$, then for that $Y_i$, $L_i^2 = \Omega(n^2/\log^2 n)$ and
using just that $i$, we get $\operatorname{disp}_{Y_i}(p_i/2) \ge L_i^2/a_i^2 = \Omega(1/\log^2 n)$ and once
again $X$ is dispersed as well (recall that $p_i = \Omega(1/n^2)$). □

7 Variance of polytopes

Let $X \in K$ be a random point in a convex body $K$. Consider the parameter
$\sigma_K$ of $K$ defined as

\[ \sigma_K^2 = \frac{n \operatorname{var}\|X\|^2}{(E\|X\|^2)^2}. \]



It has been conjectured that if $K$ is isotropic, then $\sigma_K^2 \le c$ for some universal
constant $c$ independent of $K$ and $n$ (the variance hypothesis). Together with
the isotropic constant conjecture, it implies Conjecture 1. Our lower bound
(Theorem 6) shows that the conjecture is nearly tight for isotropic polytopes
with at most poly(n) facets and they might be the limiting case.

We now give the main ideas of the proof of Theorem 6. It is well-known
that polytopes with few facets are quite different from the ball. Our theorem
is another manifestation of this phenomenon: the width of an annulus that
captures most of a polytope is much larger than one that captures most of a
ball. The idea of the proof is the following: if $0 \in P$, then we bound the variance
in terms of the variance of the cone induced by each facet. This gives
us a constant plus the variance of the facet, which is a lower-dimensional
version of the original problem. This is the recurrence in our Lemma 24. If
$0 \notin P$ (which can happen either at the beginning or during the recursion),
we would like to translate the polytope so that it contains the origin without
increasing $\operatorname{var}\|X\|^2$ too much. This is possible if certain technical conditions
hold (case 3 of Lemma 24). If not, the remaining situation can be handled
directly or reduced to the known cases by partitioning the polytope. It is
worth noting that the first case ($0 \in P$) is not generic: translating a convex
body that does not contain the origin to a position where the body contains
the origin may increase $\operatorname{var}\|X\|^2$ substantially. The next lemma states the
basic recurrence used in the proof.

Lemma 24 (recurrence). Let $T(n, f, V)$ be the infimum of $\operatorname{var}\|X\|^2$ among
all polytopes in $\mathbb{R}^n$ with volume at least $V$, with at most $f$ facets and contained
in the ball of radius $R > 0$. Then there exist constants $c_1, c_2, c_3 > 0$
such that




n var ||X 




T(n,f,V)> (l-^)r L-1./ + 2 




V \!+^r 



) 



R 2 \Rf 







(Of course, T depends on R, but we omit that dependence to simplify 
the notation, given that, in contrast with the other parameters, R is the 
same for all appearances of T.) 

Proof. Let $P$ be a polytope as in the statement (not necessarily minimal).
Let $U$ be the nearest point to the origin in $P$. We will use more than one
argument, depending on the case:

Case 1: (origin) $0 \in P$.

For every facet $F$ of $P$, consider the cone $C_F$ obtained by taking the
convex hull of the facet and the origin. Consider the affine hyperplane $H_F$
determined by $F$. Let $U$ be the nearest point to the origin in $H_F$. Let $Y_F$
be a random point in $C_F$, and decompose it into a random point $X_F + U$ in
$F$ and a scaling factor $t \in [0, 1]$ with a density proportional to $t^{n-1}$. That
is, $Y_F = t(X_F + U)$. We will express $\operatorname{var}\|Y_F\|^2$ as a function of $\operatorname{var}\|X_F\|^2$.

We have that $\|Y_F\|^2 = t^2(\|U\|^2 + \|X_F\|^2)$. Then,

\[ \operatorname{var}\|Y_F\|^2 = (E t^4)\operatorname{var}\|X_F\|^2 + (\operatorname{var} t^2)\big( \|U\|^4 + (E\|X_F\|^2)^2 + 2\|U\|^2 E\|X_F\|^2 \big). \qquad (16) \]

Now, for $k \ge 0$,

\[ E t^k = \frac{n}{n+k} \]

and

\[ \operatorname{var} t^2 = \frac{4n}{(n+4)(n+2)^2} \ge \frac{c_1}{n^2} \]

for $c_1 = 1/2$ and $n \ge 3$. This in (16) gives

\[ \operatorname{var}\|Y_F\|^2 \ge \frac{n}{n+4}\operatorname{var}\|X_F\|^2 + \frac{c_1}{n^2}\big( \|U\|^4 + (E\|X_F\|^2)^2 + 2\|U\|^2 E\|X_F\|^2 \big) \]
\[ \ge \frac{n}{n+4}\operatorname{var}\|X_F\|^2 + \frac{c_1}{n^2}(E\|X_F\|^2)^2. \qquad (17) \]
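
The moment computations for the scaling factor $t$ can be verified with exact rational arithmetic. In this sketch (function names are illustrative) `moment` encodes $E t^k = n/(n+k)$ for the density $n t^{n-1}$ on $[0, 1]$, and `var_t_squared` is compared against the closed form $4n/((n+4)(n+2)^2)$ and the lower bound $c_1/n^2$ with $c_1 = 1/2$.

```python
from fractions import Fraction

def moment(n, k):
    """E[t^k] when t has density n * t^(n-1) on [0, 1]."""
    return Fraction(n, n + k)

def var_t_squared(n):
    """var(t^2) = E[t^4] - (E[t^2])^2."""
    return moment(n, 4) - moment(n, 2) ** 2
```

The bound $\operatorname{var} t^2 \ge 1/(2n^2)$ fails at $n = 2$, which is consistent with the restriction $n \ge 3$ in the text.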



Now, by means of Lemma 17, we have that

EWXpfy^Vn-iiFf/^in-l) 
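and this in (17) implies for some constant $c_3 > 0$ that

\[ \operatorname{var}\|Y_F\|^2 \ge \frac{n}{n+4}\operatorname{var}\|X_F\|^2 + c_3 V_{n-1}(F)^{4/(n-1)}. \]

Using this for all cones induced by facets we get

\[ \operatorname{var}\|X\|^2 \ge \frac{1}{\operatorname{vol} P} \sum_{F \text{ facet}} \operatorname{vol} C_F \operatorname{var}\|Y_F\|^2 \]
\[ \ge \frac{1}{\operatorname{vol} P} \sum_{F \text{ facet}} \operatorname{vol} C_F \Big( \frac{n}{n+4}\operatorname{var}\|X_F\|^2 + c_3 V_{n-1}(F)^{4/(n-1)} \Big). \qquad (18) \]

Now we will argue that $\operatorname{var}\|X_F\|^2$ is at least $T(n-1, f, \frac{V}{Rf})$ for most facets.
Because the height of the cones is at most $R$, we have that the volume of
the cones associated to facets having $V_{n-1}(F) \le \operatorname{vol} P/\alpha$ is at most

\[ \frac{1}{n} R f \frac{\operatorname{vol} P}{\alpha}. \]

That is, the cones associated to facets having $V_{n-1}(F) > \operatorname{vol} P/\alpha$ are at
least a

\[ 1 - \frac{Rf}{\alpha n} \]

fraction of $P$. For $\alpha = Rf$ we have that a $1 - 1/n$ fraction of $P$ is composed
of cones having facets with $V_{n-1}(F) > \operatorname{vol} P/(Rf)$. Let $\mathcal{F}$ be the set of these
facets. The number of facets of any facet $F$ of $P$ is at most $f$, which implies
that for $F \in \mathcal{F}$ we have

\[ \operatorname{var}\|X_F\|^2 \ge T\Big( n-1, f, \frac{V}{Rf} \Big). \]

Then (18) becomes

\[ \operatorname{var}\|X\|^2 \ge \frac{1}{\operatorname{vol} P} \sum_{F \in \mathcal{F}} \operatorname{vol} C_F \Big( \frac{n}{n+4}\operatorname{var}\|X_F\|^2 + c_3 V_{n-1}(F)^{4/(n-1)} \Big) \]
\[ \ge \frac{1}{\operatorname{vol} P} \sum_{F \in \mathcal{F}} \operatorname{vol} C_F \Big( \frac{n}{n+4} T\Big( n-1, f, \frac{V}{Rf} \Big) + c_3 \Big( \frac{V}{Rf} \Big)^{4/(n-1)} \Big) \]
\[ \ge \Big( 1 - \frac{c_5}{n} \Big) T\Big( n-1, f, \frac{V}{Rf} \Big) + c_4 \Big( \frac{V}{Rf} \Big)^{4/(n-1)} \]

for some constants $c_5, c_4 > 0$.

Case 2: (slicing) $\operatorname{var} E(\|X\|^2 \mid X \cdot U) \ge \beta$.

In this case, using Lemma 19,

\[ \operatorname{var}\|X\|^2 = E \operatorname{var}(\|X\|^2 \mid X \cdot U) + \operatorname{var} E(\|X\|^2 \mid X \cdot U) \]
\[ \ge E \operatorname{var}(\|X\|^2 \mid X \cdot U) + \beta. \qquad (19) \]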
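
The decomposition used in Case 2 is the law of total variance, $\operatorname{var} Z = E \operatorname{var}(Z \mid Y) + \operatorname{var} E(Z \mid Y)$. The sketch below checks it exactly on a small discrete joint distribution; the function name and the example distribution are illustrative and not taken from the paper.

```python
from fractions import Fraction

def total_variance_terms(joint):
    """joint maps (y, z) -> probability. Returns (var Z, E var(Z|Y), var E(Z|Y))."""
    ez = sum(p * z for (y, z), p in joint.items())
    ez2 = sum(p * z * z for (y, z), p in joint.items())
    var_z = ez2 - ez * ez
    # Marginal distribution of Y.
    py = {}
    for (y, z), p in joint.items():
        py[y] = py.get(y, 0) + p
    e_var = 0
    e_cond_sq = 0
    for y0, p0 in py.items():
        m = sum(p * z for (y, z), p in joint.items() if y == y0) / p0
        m2 = sum(p * z * z for (y, z), p in joint.items() if y == y0) / p0
        e_var += p0 * (m2 - m * m)   # contributes to E var(Z | Y)
        e_cond_sq += p0 * m * m      # contributes to E (E(Z | Y))^2
    var_e = e_cond_sq - ez * ez
    return var_z, e_var, var_e
```

Since the second term is a variance and hence nonnegative, dropping it or bounding it from below, as in (19), always gives a valid lower bound on the total variance.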

Call the set of points $X \in P$ with some prescribed value of $X \cdot U$ a slice.
Now we will argue that the variance of a slice is at least $T(n-1, f, \frac{V}{2nR})$ for
most slices. Because the width of $P$ is at most $2R$, we have that the volume
of the slices $S$ having $V_{n-1}(S) \le V/\alpha$ is at most $2RV/\alpha$. That is, the slices
having $V_{n-1}(S) > V/\alpha$ are at least a $1 - 2R/\alpha$ fraction of $P$. For $\alpha = 2nR$,
we have that a $1 - 1/n$ fraction of $P$ are slices with $V_{n-1}(S) > V/(2nR)$.
Let $\mathcal{S}$ be the set of these slices. The number of facets of a slice is at most $f$,
which implies that for $S \in \mathcal{S}$ we have $\operatorname{var}(\|X\|^2 \mid X \in S) \ge T(n-1, f, \frac{V}{2nR})$.
Then (19) becomes

\[ \operatorname{var}\|X\|^2 \ge \Big( 1 - \frac{1}{n} \Big) T\Big( n-1, f, \frac{V}{2nR} \Big) + \frac{c_4}{16} \Big( \frac{V}{Rf} \Big)^{4/(n-1)}. \]



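Case 3: (translation) $\operatorname{var}(X \cdot U) \le \beta$ and $\operatorname{var} E(\|X\|^2 \mid X \cdot U) < \beta$.
Let $X_0 = X - U$. We have,

\[ \operatorname{var}\|X\|^2 = \operatorname{var}\|X_0\|^2 + 4\operatorname{var} X \cdot U + 4\operatorname{cov}(X \cdot U, \|X_0\|^2). \qquad (20) \]

Now, the Cauchy-Schwarz inequality and the fact that $\operatorname{cov}(A, B) = \operatorname{cov}(A, E(B \mid A))$
for random variables $A, B$, give

\[ \operatorname{cov}(X \cdot U, \|X_0\|^2) = \operatorname{cov}(X \cdot U, \|X\|^2 - 2X \cdot U + \|U\|^2) \]
\[ = \operatorname{cov}(X \cdot U, \|X\|^2) - 2\operatorname{var} X \cdot U \]
\[ = \operatorname{cov}(X \cdot U, E(\|X\|^2 \mid X \cdot U)) - 2\operatorname{var} X \cdot U \]
\[ \ge -\sqrt{\operatorname{var} X \cdot U}\sqrt{\operatorname{var} E(\|X\|^2 \mid X \cdot U)} - 2\operatorname{var} X \cdot U. \]

This in (20) gives

\[ \operatorname{var}\|X\|^2 \ge \operatorname{var}\|X_0\|^2 - 4\operatorname{var} X \cdot U - 4\sqrt{\operatorname{var} X \cdot U}\sqrt{\operatorname{var} E(\|X\|^2 \mid X \cdot U)} \]
\[ \ge \operatorname{var}\|X_0\|^2 - 8\beta. \]

Now, $X_0$ is a random point in a translation of $P$ containing the origin, and
thus case 1 applies, giving

\[ \operatorname{var}\|X\|^2 \ge \Big( 1 - \frac{c_5}{n} \Big) T\Big( n-1, f, \frac{V}{Rf} \Big) + \frac{c_4}{2} \Big( \frac{V}{Rf} \Big)^{4/(n-1)}. \]

Case 4: (partition) otherwise.

We want to control $\operatorname{var} X \cdot U$ to be able to apply the third case. To this
end, we will subdivide $P$ into parts so that one of the previous cases applies to
each part. Let $P_1 = P$, let $U_i$ be the nearest point to the origin in $P_i$ (or, if
$P_i$ is empty, the sequence stops), let $\hat U_i$ denote $U_i/\|U_i\|$,

\[ Q_i = P_i \cap \{x : \|U_i\| \le \hat U_i \cdot x \le \|U_i\| + \sqrt{\beta}/R\}, \]

and $P_{i+1} = P_i \setminus Q_i$. Observe that $\|U_{i+1}\| \ge \|U_i\| + \sqrt{\beta}/R$ and $\|U_i\| \le R$;
this implies that $i \le R^2/\sqrt{\beta}$ and the sequence is always finite.

For any $i$ and by definition of $Q_i$ we have $\operatorname{var}(X \cdot U_i \mid X \in Q_i) =
\|U_i\|^2 \operatorname{var}(X \cdot \hat U_i \mid X \in Q_i) \le \beta$.

The volume of the parts $Q_i$ having $\operatorname{vol} Q_i \le V/\alpha$ is at most $\frac{VR^2}{\alpha\sqrt{\beta}}$. That
is, the parts having $\operatorname{vol} Q_i > V/\alpha$ are at least a $1 - \frac{R^2}{\alpha\sqrt{\beta}}$ fraction of $P$.
For $\alpha = nR^2/\sqrt{\beta}$ we have that a $1 - 1/n$ fraction of $P$ are parts with
$\operatorname{vol}(Q_i) \ge V\sqrt{\beta}/(nR^2)$. Let $\mathcal{Q}$ be the set of these parts. The number of
facets of a part is at most $f + 2$. Thus, applying one of the three previous
cases to each part in $\mathcal{Q}$, and using that $f \ge n$,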

varll^f > V volQvar(||X|| 2 I X e Q) 

volP ^ 

> (1-r) I ( 1 -^) T ( n - 1 ' / + 2 ' n ^3 ma ^ /)2n }) + l|(^R^) 
In any of these cases, 

™ * (> - ?) 4 - ' + 2 ' 4 -K 1 - $)) + - {If min ( h 3 

(21) 

Now, by assumption, V < 2 n R n , and this implies by definition that 




4/(n-l) 



nR 2 

That is, 



mm 1, — — r = O 



nR 2 J \nR 2 

and the lemma follows, after replacing the value of $\beta$ in Equation (21). □

Proof (of Theorem 6). The inequality claimed in the theorem is invariant
under (uniform) scaling (which would change the volume as well as the
radius of the circumscribed sphere), and thus for the proof we can assume
that $\operatorname{vol} P = 1$, without loss of generality. For $n \ge 13$, this implies that
$R \ge 1$. We use the recurrence lemma in a nested way $t = n/\log n$ times^2.
The radius $R$ stays fixed, and the number of facets involved is at most
f + 2t < 3/. Each time, the volume is raised to the power of at most 1 + 
and divided by at most 



u := c'nR 2 (R(f + 2t)f + ^ > 1, 

for d = max(c^ , 1). That is, after t times the volume is at least (using the 
fact that (1 + = 0(1) and denoting v = 1 + ^) 



u 



£i= V = ( c 'nR 2 (R(f + 2t)) 1+ —A n ~ f > l/(3c'n J R 3 /)°W 



That means that from the recurrence inequality we get (we ignore the ex- 
pression in "?", as we will discard that term): 

T(n, /, 1) > (l - ^)'T(n -t,f + 2t, ?) + 

+ c , t d _ siy- 1 I ( _J I 

3 V n) R*/(n-t-i) \3Rf {?>c'nRZf)0{t) 
We discard the first term and simplify to get, 

/ 1 X 0(1/ log n) 



r(n,/,i)> 



logn \R?f 



Thus, for a polytope of arbitrary volume we get by means of a scaling that
there exists a universal constant $c > 0$ such that



var||X|| 2 >(volP) 4 /^ (volP)3A 



c/ log n 



n 



R 3 f I log n 

The theorem follows. □ 



2 To force t to be an integer would only add irrelevant complications that we omit. 






8 Discussion 



The results for determinant/volume hold with the following stronger oracle:
we can specify any $k \times k$ submatrix $A'$ of $A$ and a vector $x \in \mathbb{R}^k$ and
ask whether $\|A'x\|_\infty \le 1$. In particular, this allows us to query individual
entries of the matrix. More specifically, consider the oracle that takes indices
$i, j$ and $a \in \mathbb{R}$ and returns whether $A_{ij} \le a$. Using this oracle, our proof
(Lemma 7) yields the following result: there is a constant $c > 0$ such that any
randomized algorithm that approximates the determinant to within a $(1 + c)$
factor has complexity $\Omega(n^2)$. In the property testing framework, this rules
out sublinear (in the input size) methods for estimating the determinant,
even with randomized (adaptive) access to arbitrary entries of the input
matrix.
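The two oracles described above can be mocked up as follows. This is a minimal sketch under stated assumptions: the class `MatrixOracles`, its method names, and `locate_entry` are hypothetical and not taken from the paper, and the hidden matrix is stored as plain Python lists. It shows how a $1 \times 1$ submatrix query tests the magnitude of a single entry, and how threshold queries $A_{ij} \le a$ locate an entry by binary search while every query is counted.

```python
from typing import List, Sequence


class MatrixOracles:
    """Query-counting oracles for a hidden square matrix (hypothetical helper)."""

    def __init__(self, matrix: Sequence[Sequence[float]]) -> None:
        self._a: List[List[float]] = [list(row) for row in matrix]
        self.queries = 0  # total number of oracle calls made so far

    def submatrix_norm_query(
        self, rows: Sequence[int], cols: Sequence[int], x: Sequence[float]
    ) -> bool:
        """Answer whether ||A' x||_inf <= 1 for the k x k submatrix A' = A[rows, cols]."""
        if not (len(rows) == len(cols) == len(x)):
            raise ValueError("rows, cols and x must all have length k")
        self.queries += 1
        return all(
            abs(sum(self._a[i][j] * xj for j, xj in zip(cols, x))) <= 1.0
            for i in rows
        )

    def entry_threshold_query(self, i: int, j: int, a: float) -> bool:
        """Answer whether A_ij <= a."""
        self.queries += 1
        return self._a[i][j] <= a


def locate_entry(
    oracle: MatrixOracles, i: int, j: int, lo: float, hi: float, tol: float = 1e-6
) -> float:
    """Binary search for A_ij, assumed to lie in [lo, hi], via threshold queries."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if oracle.entry_threshold_query(i, j, mid):
            hi = mid  # A_ij <= mid, so keep the lower half
        else:
            lo = mid  # A_ij > mid, so keep the upper half
    return (lo + hi) / 2.0


oracle = MatrixOracles([[2.0, 0.5], [0.0, 1.0]])
# Full matrix: membership of x in the parallelepiped {x : ||A x||_inf <= 1}.
inside = oracle.submatrix_norm_query([0, 1], [0, 1], [0.25, 0.5])
# 1 x 1 submatrix with x = 1/a: tests whether |A_01| <= a, here with a = 0.75.
small_entry = oracle.submatrix_norm_query([0], [1], [1 / 0.75])
estimate = locate_entry(oracle, 0, 1, lo=-4.0, hi=4.0)
print(inside, small_entry, estimate, oracle.queries)
```

Each call to `locate_entry` uses about $\log_2((hi - lo)/tol)$ threshold queries, so the counter makes the cost of reading entries through the oracle explicit.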

A posteriori, the way the volume lower bound is proved resembles an
idea used in communication complexity: discrepancy lower bounds. There,
one gives an upper bound on the size of "almost monochromatic rectangles",
which implies a lower bound on the number of rectangles and, thus,
on the communication complexity of the given function. In our case, we give
an upper bound on the measure of product sets where the determinant does
not change too much. Moreover, our results imply a lower bound for the
following multi-party problem: there are $n$ players, player $i$ gets to know
only the $i$th row of a given $n \times n$ real matrix $A$, and they want to
approximate $|\det A|$ up to a multiplicative constant. Then in any protocol where
each of them broadcasts bits, they must broadcast $\Omega(n^2/\log n)$ bits, even
for randomized protocols succeeding with high probability and even if the
matrix is restricted to be far from singular as in Theorem 3.

In our lower bounds for the product, the error bound is $1 + c/\log n$,
where the logarithmic factor comes from the variance lemma. It is an open
problem whether this factor can be removed in the variance lower
bound.

For the volume problem itself, the best known algorithm has complexity
roughly $O(n^4)$, but the complexity of that algorithm is conjectured to be
$n^3$. It is conceivable that our lower bound for membership oracle queries
can be improved to $n^3$, although one would have to use bodies other than
parallelepipeds. Also, it is an open problem to give a faster algorithm using
a separation oracle.

Finally, we hope that the tools introduced here are useful for other prob- 
lems. 